TRANSACTIONS
OF THE
AMERICAN MATHEMATICAL SOCIETY


EDITED BY 


ROBERT D. CARMICHAEL 
FRANCIS R. SHARPE 


JACOB D. TAMARKIN 


WITH THE COOPERATION OF 


ERIC T. BELL EDWARD W. CHITTENDEN WILLIAM C. GRAUSTEIN 
OLIVE C. HAZLETT EINAR HILLE AUBREY J. KEMPNER 
JOHN R. KLINE ERNEST P. LANE CHARLES N. MOORE 
MARSTON MORSE GEORGE Y. RAINICH JOSEPH F. RITT 
CAROLINE E. SEELY CHARLES H. SISAM MARSHALL H. STONE 


VOLUME 36 
1934 


PUBLISHED BY THE SOCIETY 
MENASHA, WIS., AND NEW YORK 
1934 




COMPOSED, PRINTED AND BOUND BY 
The Collegiate Press 
GEORGE BANTA PUBLISHING COMPANY 
MENASHA, WISCONSIN 




TABLE OF CONTENTS 
VOLUME 36, 1934 


                                                                         PAGE
Adams, C. R., and Clarkson, J. A., of Providence, R. I. Properties of functions f(x, y) of bounded variation . . . 711
Albert, A. A., of Princeton, N. J. Normal division algebras over a modular field . . . 388
    On normal Kummer fields over a non-modular field . . . 885
Bell, E. T., of Pasadena, Calif. On the power series for elliptic functions . . . 841
Bernstein, B. A., of Berkeley, Calif. A set of four postulates for Boolean algebra in terms of the “implicative” operation . . . 876
Brahana, H. R., of Urbana, Ill. Metabelian groups of order p^(n+m) with commutator subgroups of order p^m . . . 776
Brown, A. B., and Koopman, B. O., of New York, N. Y. The Riemann multiple-space and algebroid functions . . . 618
Cameron, R. H., of Ithaca, N. Y. Almost periodic transformations . . . 276
Clarkson, J. A., and Adams, C. R., of Providence, R. I. Properties of functions f(x, y) of bounded variation . . . 711
Diamond, A. H., of Berkeley, Calif. Correction to a paper on the Whitehead-Huntington postulates . . . 893
Dickson, L. E., of Chicago, Ill. Waring’s problem for cubic functions . . . 1
    A new method for Waring theorems with polynomial summands . . . 731
Doob, J. L., of New York, N. Y. Probability and statistics . . . 759
Dye, L. A., and Sharpe, F. R., of Ithaca, N. Y. The Bertini transformation in space . . . 292
Eiesland, J., of W. Va. The ruled V_4^4 in S_5 associated with a Schläfli hexad . . . 315
Gore, G. D., of Chicago, Ill. Inscribed sequences of surfaces associated with generalised sequences of Laplace . . . 530
Graustein, W. C., of Cambridge, Mass. The geometry of Riemannian spaces . . . 542
Grove, V. G., of East Lansing, Mich. On a certain correspondence between surfaces in hyperspace . . . 627
Hestenes, M. R., of Cambridge, Mass. Sufficient conditions for the problem of Bolza in the calculus of variations . . . 793
Hildebrandt, T. H., of Ann Arbor, Mich. On bounded linear functional operations . . . 868
J. J. L., of Iowa. On the problem of n bodies . . . 306
James, R. D., of Pasadena, Calif. The value of the number g(k) in Waring’s problem . . . 395


Jeffery, R. L., of Wolfville, Nova Scotia. Derived numbers with respect to functions of bounded variation . . . 749
Kellogg, O. D., of Cambridge, Mass. Converses of Gauss’ theorem on the arithmetic mean . . . 227
Koopman, B. O., and Brown, A. B., of New York, N. Y. The Riemann multiple-space and algebroid functions . . . 618
Lane, E. P., of Chicago, Ill. The moving trihedron . . . 696
Langer, R. E., of Madison, Wis. The asymptotic solutions of certain linear ordinary differential equations of the second order . . . 90
    The solutions of the Mathieu equation with a complex variable and at least one parameter large . . . 637
McCoy, N. H., of Northampton, Mass. On quasi-commutative matrices . . . 327
MacColl, L. A., of New York, N. Y. On the distributions of the zeros of sums of exponentials of polynomials . . . 341
M. L., of Memphis, Tenn. A projective generalisation of metrically defined associate surfaces . . . 826
Miller, G. A., of Urbana, Ill. Groups in which the squares of the elements are a dihedral subgroup . . . 819
von Neumann, J., of Princeton, N. J. Almost periodic functions in a group . . . 445
Newton, A. V., of Chicago, Ill. Consecutive covariant configurations at a point of a space curve . . . 44
Ore, O., of New Haven, Conn. Contributions to the theory of finite fields . . . 243
    Errata in my paper “On a special class of polynomials” . . . 275
Raudenbush, H. W., of New York, N. Y. Ideal theory and algebraic differential equations . . . 361
Regan, F., of St. Louis, Mo. The application of the theory of admissible numbers to time series with constant probability . . . 511
Rosskopf, M. F., of Clayton, Mo. Some inequalities for non-uniformly bounded ortho-normal polynomials . . . 853
Russell, H. G., of Wellesley, Mass., and Walsh, J. L., of Cambridge, Mass. On the convergence and overconvergence of sequences of polynomials of best simultaneous approximation to several functions analytic in distinct regions . . . 13
Seidel, W., of Cambridge, Mass. On the distribution of values of bounded analytic functions . . . 201
Sharpe, F. R., and Dye, L. A., of Ithaca, N. Y. The Bertini transformation in space . . . 292
Simmons, H. A., of Evanston, Ill. The first and second variations of an n-tuple integral in the case of variable limits . . . 29
Walsh, J. L., of Cambridge, Mass., and Russell, H. G., of Wellesley, Mass. On the convergence and overconvergence of sequences of polynomials of best simultaneous approximation to several functions analytic in distinct regions . . . 13
Webber, G. C., of Chicago, Ill. Waring’s problem for cubic functions . . . 493
Whitney, H., of Cambridge, Mass. Analytic extensions of differentiable functions defined in closed sets
    Differentiable functions defined in closed sets. I . . . 369
Widder, D. V., of Cambridge, Mass. The inversion of the Laplace integral and the related moment problem . . . 107
Zygmund, A., of Vilna, Poland. Some points in the theory of trigonometric
Portrait of E. H. Moore . . . Frontispiece




ELIAKIM HASTINGS MOORE
January 26, 1862-December 30, 1932 


First Editor-in-Chief of these TRANSACTIONS 


WARING’S PROBLEM FOR CUBIC FUNCTIONS* 


BY 
L. E. DICKSON 


1. INTRODUCTION 


In 1921 it was proved by Kamke that if $f(x)$ is a polynomial with rational
coefficients whose value is an integer $\ge 0$ for every integer $x \ge 0$, then every
integer $\ge 0$ is a sum of a limited number $\mu$ of 1’s and a limited number $\nu$ of
values of $f(x)$ for integers $x \ge 0$. This existence theorem was later proved by
the method of Hardy and Littlewood by Winogradow and Landau.

For the case of any quadratic function, the writer (and later Dr. Pall)
evaluated the limits $\mu$ and $\nu$.

We shall here treat cubic functions (1). The case in which a term $x^2$ occurs
is under investigation by my students. The main result is Theorem 2. For
special cubic functions, Theorems 4 and 5 give universal Waring theorems.


2. DETERMINATION OF ALL FUNCTIONS (1) WITH CERTAIN PROPERTIES 


We restrict attention to cubic functions of the form

(1)   $f(x) = (\alpha x^3 + \beta x)/d \qquad (\alpha \ne 0,\ d > 0),$

(2)   $\alpha,\ \beta,\ d$ integers without a common factor $> 1$.

We assume that $f(x)$ is an integer for every integer $x \ge 0$. By the values 1
and 2 of $x$, we see that $\alpha + \beta$ and $8\alpha + 2\beta$ are divisible by $d$, whence

(3)   $6\alpha$ and $6\beta$ are divisible by $d$.

If $d$ has a prime factor $p > 3$, then $\alpha$ and $\beta$ are divisible by $p$, contrary to (2).
Hence 2 and 3 are the only possible prime factors of $d$.

To discuss only a pure Waring problem, we assume that $f(x) \ge 0$ if $x$ is
any integer $\ge 0$, and that 1 is a value of $f(\xi)$ for a certain integer $\xi > 0$ (otherwise,
sums of values of $f$ would never give the number 1).

I. Case $d$ a multiple of 6. Write $d = 6t$. By (3), $\alpha$ and $\beta$ are divisible by $t$,
whence $t = 1$, by (2), and $d = 6$. By $f(\xi) = 1$,

(4)   $\alpha\xi^3 + \beta\xi = 6,$

whence $\xi$ is a positive divisor of 6.


* Presented to the Society, June 19, 1933; received by the editors June 23, 1933. 


2 L. E. DICKSON [January 


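I$_1$. Let $\xi = 3w$, $w = 1$ or 2. By (4), $9\alpha w^3 + \beta w = 2$. Elimination of $\beta$ gives

      $f(x) = \tfrac16\left[\alpha x^3 + \dfrac{2 - 9\alpha w^3}{w}\,x\right],$

      $f(w) = \tfrac13(1 - 4\alpha w^3) \ge 0,\ \alpha \le 0; \qquad f(6w) = 2 + 27\alpha w^3 \ge 0,\ \alpha \ge 0,$

whence $\alpha = 0$, contrary to hypothesis. Thus Case I$_1$ is excluded.

I$_2$. Let $\xi = 2$. By (4), $4\alpha + \beta = 3$. Then

      $f(1) = \tfrac12(1 - \alpha) \ge 0,\ \alpha \le 1; \qquad f(3) = \tfrac16(15\alpha + 9) \ge 0,\ \alpha \ge 0,$

whence $\alpha = 1$, and $f(x)$ is the pyramidal number

(5)   $P(x) = \tfrac16(x^3 - x).$

This function satisfies all of our preceding assumptions.

I$_3$. There remains only the case $\xi = 1$. Then $\alpha + \beta = 6$,

(6)   $f(x) = x + \tfrac16\alpha(x^3 - x) = x + \alpha P(x), \qquad \alpha > 0,$

which is an integer $\ge 0$ for every integer $x \ge 0$, since the same is true of $P(x)$.

II. Case $d$ a multiple of 3, but not of 6. Then $d = 3t$, where $t$ is a positive odd
integer. By (3), $\alpha$ and $\beta$ are divisible by $t$, whence $t = 1$ by (2). Thus

      $f(x) = \tfrac13(\alpha x^3 + \beta x), \qquad \alpha\xi^3 + \beta\xi = 3,$

whence $\xi$ is a positive divisor of 3.

II$_1$. Let $\xi = 3$. Then $\beta = 1 - 9\alpha$,

      $f(1) = \tfrac13(1 - 8\alpha) \ge 0,\ \alpha \le 0; \qquad f(4) = \tfrac13(4 + 28\alpha) \ge 0,\ \alpha \ge 0.$

Since $\alpha \ne 0$, Case II$_1$ is excluded.

II$_2$. Hence $\xi = 1$, $\beta = 3 - \alpha$,

(7)   $f(x) = x + \tfrac13\alpha(x^3 - x) = x + 2\alpha P(x), \qquad \alpha > 0.$

III. Case $d$ a multiple of 2, but not of 3. Similarly as in II, we find that $d = 2$,
$\xi = 1$, $\alpha + \beta = 2$,

(8)   $f(x) = x + \tfrac12\alpha(x^3 - x) = x + 3\alpha P(x), \qquad \alpha > 0.$

IV. $d = 1$. Then $\xi = 1$, $\alpha + \beta = 1$, $f(x) = x + 6\alpha P(x)$.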


1934] WARING’S PROBLEM FOR CUBIC FUNCTIONS 3 


THEOREM 1. If $f(x) = (\alpha x^3 + \beta x)/d$ $(\alpha \ne 0)$ is an integer $\ge 0$ for every integer
$x \ge 0$, and if $f(x) = 1$ for some integer $x > 0$, then $f(x)$ is either a pyramidal number
$P(x) = \tfrac16(x^3 - x)$, or is $x + \epsilon P(x)$, where $\epsilon$ is a positive integer. Conversely, each
of the resulting functions is an integer $\ge 0$ for every integer $x \ge 0$, while $f(x) = 1$
has a positive integral solution.
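
Theorem 1 can be spot-checked by brute force. The sketch below is an illustration added here, not part of the paper: it enumerates small triples $(\alpha, \beta, d)$, keeps those for which $f$ is a non-negative integer for $0 \le x \le 30$ and takes the value 1, and confirms that each survivor is $P(x)$ or $x + \epsilon P(x)$. The search ranges and the helper names `admissible` and `classify` are assumptions of the sketch.

```python
from math import gcd

def classify(alpha, beta, d):
    """Return 'P' for the pyramidal number, the integer eps > 0 when
    f(x) = x + eps*P(x), or None when neither form applies."""
    if (alpha, beta, d) == (1, -1, 6):
        return "P"
    # x + eps*(x^3 - x)/6 means alpha/d = eps/6 and beta/d = 1 - eps/6
    if (6 * alpha) % d == 0:
        eps = 6 * alpha // d
        if eps > 0 and 6 * beta == (6 - eps) * d:
            return eps
    return None

def admissible(alpha, beta, d, xmax=30):
    """f is an integer >= 0 for 0 <= x <= xmax and equals 1 for some x > 0."""
    vals = []
    for x in range(xmax + 1):
        num = alpha * x**3 + beta * x
        if num % d or num < 0:
            return False
        vals.append(num // d)
    return 1 in vals[1:]

found = {}
for d in range(1, 13):
    for alpha in range(-8, 9):
        if alpha == 0:
            continue
        for beta in range(-60, 61):
            if gcd(gcd(alpha, beta), d) != 1:
                continue  # condition (2): no common factor > 1
            if admissible(alpha, beta, d):
                found[(alpha, beta, d)] = classify(alpha, beta, d)
```

Within these ranges the finite test is as strong as the hypotheses of the theorem, since the proof above only uses values of $x$ up to 12.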


3. THE MAIN THEOREM AND THREE LEMMAS 


THEOREM 2. To each positive integer $\epsilon$ prime to 3 there correspond positive
integers $C$ and $\nu \ge 8$ such that every integer $\ge C \cdot 3^{3\nu}$ is a sum of nine values of

(9)   $f(x) = x + \tfrac16\epsilon(x^3 - x)$

for integral values $\ge 0$ of $x$.

We shall first give the parts of the proof which hold both for $\epsilon = 1$ and
$\epsilon > 1$, and then establish the few simple additional facts required in the case
$\epsilon = 1$, and hence prove Theorem 2 for $\epsilon = 1$ with $C = 168$, $\nu = 8$. Then we shall
present the more elaborate theory for $\epsilon > 1$ (which does not hold for $\epsilon = 1$).
That theory gives a reconstructed proof which provides an explicit program
actually to express any sufficiently large integer as a sum of nine values of
$f(x)$.

LEMMA 1. There exists an integer $m'$ such that any given integer is congruent
to $f(3m')$ modulo $3^n$.

The difference $f(z + 3r) - f(z)$ has the value

(10)   $\Delta = \tfrac12\epsilon(3rz^2 + 9r^2z + 9r^3) + 3r - \tfrac12\epsilon r.$

Since 3 is a factor of all terms except the last, while $\epsilon$ is prime to 3, $\Delta \not\equiv 0$ if
$r \not\equiv 0 \pmod{3^n}$. Take $r = m' - k$, $z = 3k$, $0 < r < 3^n$. Then

      $f(3m') - f(3k) = f[3k + 3(m' - k)] - f(3k) \not\equiv 0 \pmod{3^n}.$

Hence for $j = 0, 1, \dots, 3^n - 1$, the $3^n$ integers $f(3j)$ are incongruent modulo
$3^n$, so that any integer is congruent to one of them.
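
A quick numerical check of Lemma 1, added here as an illustration: for several $\epsilon$ prime to 3 and small $n$, the residues of $f(3j)$ modulo $3^n$ are counted. The helper names are hypothetical; the formula $f(3j) = 3j + \epsilon(9j^3 - j)/2$ follows from (9).

```python
def f3(j, eps):
    # f(3j) = 3j + eps*(9j^3 - j)/2; the factor 9j^3 - j is always even
    return 3 * j + eps * (9 * j**3 - j) // 2

def residues_are_distinct(eps, n):
    """True when f(3j), j = 0..3^n - 1, are pairwise incongruent mod 3^n."""
    mod = 3**n
    return len({f3(j, eps) % mod for j in range(mod)}) == mod
```

The hypothesis that $\epsilon$ is prime to 3 matters: with $\epsilon = 3$ and $n = 2$ the residues collide, since $f(9) = 369 \equiv f(0) \pmod 9$.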


LEMMA 2. If $\eta$ is an odd constant integer, $v(\eta - v)$ is even and can be made
congruent to any assigned even integer modulo $2^k$ by choice of an integer $v$.

Let $V(\eta - V) \equiv v(\eta - v) \pmod{2^k}$. Then the product of $V - v$ by $V + v - \eta$
is divisible by $2^k$, while the factors are of unlike parity. Hence one factor is
odd and the other is divisible by $2^k$. Thus $V \equiv v$ or $\eta - v \pmod{2^k}$. Hence when
$v$ ranges over the $2^k$ values $0, 1, \dots, 2^k - 1$, we obtain at most (and hence
exactly) $\tfrac12 \cdot 2^k$ values of $v(\eta - v)$ incongruent modulo $2^k$, and the latter values
are all even. This proves Lemma 2.
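
Lemma 2 is easy to confirm by enumeration. The following sketch (an added illustration; `values_mod` is a name chosen here) collects the residues of $v(\eta - v)$ modulo $2^k$ as $v$ runs over a complete residue system.

```python
def values_mod(eta, k):
    """Residues of v*(eta - v) modulo 2^k for v = 0, ..., 2^k - 1."""
    mod = 2**k
    return {(v * (eta - v)) % mod for v in range(mod)}

def lemma2_holds(eta, k):
    # the lemma asserts that exactly the even residues occur
    return values_mod(eta, k) == set(range(0, 2**k, 2))
```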


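LEMMA 3. If $n > 1$ and $m < \epsilon \cdot 3^n$, then $f(3m) < \gamma \cdot 3^{3n}$, where

(11)   $\gamma = \tfrac12(9\epsilon^4 + 1).$

Since $9m^3 - m$ increases with $m$,

      $f(3m) = 3m + \tfrac12\epsilon(9m^3 - m) < 3\epsilon 3^n + \tfrac12\epsilon(9\epsilon^3 3^{3n} - \epsilon 3^n),$

which will be $< \gamma 3^{3n}$ if $3\epsilon - \tfrac12\epsilon^2 < \tfrac12 3^{2n}$. The latter holds for every $n$ if $\epsilon \ge 6$
and holds for $n > 1$ if $\epsilon \le 5$, since the maximum of $6\epsilon - \epsilon^2$ is its value 9 for
$\epsilon = 3$.
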
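
The bound of Lemma 3 can be tested in exact integer arithmetic. This added sketch assumes the reading $\gamma = \tfrac12(9\epsilon^4 + 1)$ of (11) and checks only the largest admissible $m$, which suffices because $f(3m)$ increases with $m$.

```python
def lemma3_holds(eps, n):
    """Check 2*f(3m) < 2*gamma*3^(3n) for the largest m < eps*3^n."""
    m = eps * 3**n - 1
    two_f = 6 * m + eps * (9 * m**3 - m)          # 2 f(3m)
    two_bound = (9 * eps**4 + 1) * 3**(3 * n)     # 2 * gamma * 3^(3n), gamma as assumed above
    return two_f < two_bound
```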
4. PLAN OF PROOF WITH THE NECESSARY FORMULAS 


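If $s$ and $C$ are given positive numbers, we can evidently choose a positive
integer $n$ so that

      $C \cdot 27^n \le s < C \cdot 27^{n+1}.$

In Theorem 2, $s \ge C \cdot 27^{\nu}$. Hence we may take $n \ge \nu \ge 8$.
Any such $s$ is one of the integers $s_i$ falling in the following three sub-intervals:

      $3^{i-1}C \cdot 3^{3n} \le s_i < 3^{i}C \cdot 3^{3n} \qquad (i = 1, 2, 3).$

By Lemma 1 we can choose an integer $m_i$ so that

      $s_i = f(3m_i) + 3^n M_i, \qquad 0 \le m_i < \epsilon \cdot 3^n,$

where $M_i$ is an integer. Since $f(3m_i) \ge 0$, $3^n M_i \le s_i < 3^i C 3^{3n}$. Using also Lemma
3, we get

      $(3^{i-1}C - \gamma)3^{2n} < M_i < 3^i C 3^{2n}.$

Write $M_i = \epsilon 3^{2n} + N_i$. Then

(12)   $(3^{i-1}C - \gamma - \epsilon)3^{2n} < N_i < (3^i C - \epsilon)3^{2n} \qquad (i = 1, 2, 3).$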

Take $l = 3^n$ in the identity

      $f(l - x) + f(l + x) = 2l + \tfrac13\epsilon(l^3 - l + 3lx^2)$

and sum for three values $x_j$ of $x$. Thus

      $\sum_{j=1}^{3}\left[f(3^n - x_j) + f(3^n + x_j)\right] = T,$

      $T = \epsilon 3^{3n} + 3^n(\epsilon Q - \epsilon + 6), \qquad Q = x_1^2 + x_2^2 + x_3^2.$

Write $\phi_i$ for $f(v_i) + f(w_i)$ and $Q_i$ for $Q$. Then will

      $s_i = f(3m_i) + 3^n(\epsilon 3^{2n} + N_i) = f(3m_i) + \phi_i + T$

(and hence $s_i$ will be a sum of nine values of $f$) if

(13)   $3^n(N_i + \epsilon - 6) = \phi_i + \epsilon 3^n Q_i \qquad (i = 1, 2, 3).$

We impose on the unknowns $v_i$, $w_i$ the restrictions

(14)   $v_i + w_i = 3b_i 3^n \qquad (i = 1, 2, 3;\ b_i$ a positive odd integer$).$


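The identity

      $\phi_i = (v_i + w_i)\left[1 + \tfrac16\epsilon\{(v_i + w_i)^2 - 3v_i w_i - 1\}\right]$

gives $\phi_i = 3^n B_i$, where

(15)   $B_i = 3b_i + \tfrac12 b_i\epsilon\{9b_i^2 3^{2n} - 1 - 3v_i(3b_i 3^n - v_i)\}.$

Inserting $\phi_i = 3^n B_i$ into (13) and cancelling $3^n$, we get

(16)   $\epsilon Q_i = N_i + \epsilon - 6 - B_i.$

We shall later choose the $v_i$ so that

(17)   $0 \le v_i \le 3b_i 3^n, \qquad 0 \le N_i + \epsilon - 6 - B_i \le \epsilon 3^{2n}.$

These and (14) and (16) imply

(18)   $0 \le w_i, \qquad 0 \le Q_i \le 3^{2n}.$

We shall later prove that we may take $Q_i$ to be an integer which is a sum
of three squares of integers $x_j \ge 0$. Thus $x_j \le 3^n$ by (18). It will then follow that
$s_i$ is the sum of the values of $f(x)$ for the nine values $3m_i$, $v_i$, $w_i$, $3^n - x_j$, $3^n + x_j$
of $x$, each an integer $\ge 0$.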
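
The two identities on which the plan rests can be verified with exact rational arithmetic. The sketch below is an added illustration: it tests the reflection identity for $f(l - x) + f(l + x)$ and the relation $f(v) + f(w) = 3^n B$ with $B$ as read in (15); the function names and the sampling ranges are assumptions of the sketch.

```python
from fractions import Fraction
import random

def f(x, eps):
    # f(x) = x + (eps/6)(x^3 - x), kept exact with Fractions
    return x + Fraction(eps, 6) * (x**3 - x)

def identities_hold(eps, n, b, v, x):
    l = 3**n
    # f(l - x) + f(l + x) = 2l + (eps/3)(l^3 - l + 3 l x^2)
    reflection = (f(l - x, eps) + f(l + x, eps)
                  == 2 * l + Fraction(eps, 3) * (l**3 - l + 3 * l * x * x))
    eta = 3 * b * l                 # v + w = 3 b 3^n, as in (14)
    w = eta - v
    braces = 9 * b * b * l * l - 1 - 3 * v * (eta - v)   # quantity in { } in (15)
    B = 3 * b + Fraction(b * eps, 2) * braces
    return reflection and f(v, eps) + f(w, eps) == l * B

random.seed(0)
samples = [(random.randint(1, 20), random.randint(1, 5), 2 * random.randint(0, 6) + 1,
            random.randint(0, 500), random.randint(0, 200)) for _ in range(200)]
```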

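Employ the abbreviation

(19)   $V_i = v_i - \tfrac32 b_i 3^n.$

In the right member of (16), we insert the value of $B_i$ and get

      $S_i = N_i + \epsilon - 6 - 3b_i\left[1 + \tfrac16\epsilon\left(3V_i^2 + \tfrac94 b_i^2 3^{2n} - 1\right)\right].$

The final condition (17) is $0 \le S_i \le \epsilon 3^{2n}$. Now $S_i \ge 0$ if $\tfrac13 A_i \ge V_i^2$, where

(20)   $A_i = \dfrac{6}{\epsilon}\left[\dfrac{N_i + \epsilon - 6}{3b_i} - 1\right] - \tfrac94 b_i^2 3^{2n} + 1,$

and hence if

(21)   $A_i \ge 0, \qquad V_i \ge 0, \qquad (\tfrac13 A_i)^{1/2} \ge V_i.$

Next, $S_i \le \epsilon 3^{2n}$ if $\tfrac13 G_i \le V_i^2$, where

(22)   $G_i = A_i - 2 \cdot 3^{2n}/b_i,$

and hence if

(23)   $G_i \ge 0, \qquad V_i \ge 0, \qquad (\tfrac13 G_i)^{1/2} \le V_i.$

If we assume that $G_i \ge 0$ (whence $A_i \ge 0$), as well as

(24)   $(\tfrac13 G_i)^{1/2} \le V_i \le (\tfrac13 A_i)^{1/2}, \qquad (\tfrac13 A_i)^{1/2} \le \tfrac32 b_i 3^n,$

we see that (21) and (23) follow and that $v_i \le 3b_i 3^n$, whence (17) hold.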
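By using the values (20) and (22) of $A_i$ and $G_i$, we see that condition
$G_i \ge 0$ and the final inequality (24) are equivalent to

      $l_i \equiv \left(\tfrac98\epsilon b_i^3 + \epsilon\right)3^{2n} + \beta_i \le N_i \le L_i,$

where

(25)   $L_i = \tfrac92\epsilon b_i^3 3^{2n} + \beta_i, \qquad \beta_i = 3b_i\left(1 - \tfrac16\epsilon\right) + 6 - \epsilon = \left(1 + \tfrac12 b_i\right)(6 - \epsilon).$

This inequality will evidently follow if $l_i$ is $\le$ the lower limit in (12) and if
$L_i$ is $\ge$ the upper limit in (12), and hence if

(26)   $\tfrac98\epsilon b_i^3 + 2\epsilon + \gamma + \dfrac{\beta_i}{3^{2n}} \le 3^{i-1}C \le \tfrac32\epsilon b_i^3 + \tfrac13\epsilon + \dfrac{\beta_i}{3^{2n+1}} \qquad (i = 1, 2, 3).$

When $\epsilon = 1$, $n \ge 8$, inequalities* (26) all hold if

(27)   $b_1 = 5, \quad b_2 = 7, \quad b_3 = 11, \quad C = 168.$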
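
The choice (27) can be checked directly against the two-sided condition on $3^{i-1}C$. The sketch below is an added illustration and rests on two assumptions: the reading $\gamma = \tfrac12(9\epsilon^4 + 1)$ of (11), and the form of (26) obtained by comparing $l_i$ and $L_i$ with the limits in (12). The function name `c_range` is hypothetical.

```python
from fractions import Fraction

def c_range(eps, b, i, n):
    """Interval of admissible C for given eps, b_i, i and n, from the assumed form of (26)."""
    gamma = Fraction(9 * eps**4 + 1, 2)
    beta = (1 + Fraction(b, 2)) * (6 - eps)
    lower = Fraction(9, 8) * eps * b**3 + 2 * eps + gamma + beta / 3**(2 * n)
    upper = Fraction(3, 2) * eps * b**3 + Fraction(eps, 3) + beta / 3**(2 * n + 1)
    scale = 3**(i - 1)
    return lower / scale, upper / scale

def choice_is_valid(eps, bs, C, n):
    return all(lo <= C <= hi
               for lo, hi in (c_range(eps, b, i, n) for i, b in enumerate(bs, start=1)))
```

Under these assumptions $C = 168$ is the least integer that works for $b_3 = 11$, since the lower limit for $i = 3$ lies between 167 and 168.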


Since we shall need to assign to $v_i$ a prescribed residue modulo 8, we desire
that at least 8 consecutive integral values of $v_i$ satisfy the first inequality
(24) for every $i = 1, 2, 3$. The difference between its limits is

      $D_i = (\tfrac13 A_i)^{1/2} - (\tfrac13 G_i)^{1/2}.$

Write $\mu_i = 2 \cdot 3^{2n}/(b_i A_i)$. By (22) and (23$_1$), $0 < \mu_i \le 1$. Thus $D_i$ is the product
of $(\tfrac13 A_i)^{1/2}$ by

      $1 - (1 - \mu_i)^{1/2} = \dfrac{\mu_i}{1 + (1 - \mu_i)^{1/2}} > \dfrac{\mu_i}{2}.$

Hence

(28)   $D_i > \dfrac{3^{2n}}{b_i(3A_i)^{1/2}}.$

By (12) and (20),

(29)   $3A_i < \dfrac{18}{\epsilon}\left[\dfrac{(3^i C - \epsilon)3^{2n} + \epsilon - 6}{3b_i} - 1\right] - \tfrac{27}{4} b_i^2 3^{2n} + 3.$

* The limits for $C$ are approximately 147, 188 if $i = 1$; 130.8, 172.6 if $i = 2$; 167.1, 221.9 if $i = 3$.
Since we desire a minimum $C$ independent of $i$, we take $C = 168$.


5. PROOF OF THEOREM 2 WHEN $\epsilon = 1$

When $\epsilon = 1$, (27)-(29) give

      $3A_1 < 435 \cdot 3^{2n} < 21^2 \cdot 3^{2n}, \qquad D_1 > 3^n/105 > 62,$

and $D_2 > 29$, $D_3 > 14$, whence each $D_i > 8$. By (15),

(30)   $2B_i - 6b_i = b_i\epsilon F,$

where $F$ denotes the quantity in { } in (15). By Lemma 2, we can choose
$v_i$ (mod 8) so that $F$ is congruent modulo 8 to any assigned even integer.
Thus in (16) we can choose $v_i$ (mod 8) so that $2\epsilon Q_i \equiv 2z \pmod 8$, where $z$ is
an arbitrary integer. Take $z = \epsilon = 1$. Then $Q_i \equiv 1 \pmod 4$. But $Q_i \ge 0$. Hence
$Q_i$ is a sum of three integral squares. This proves Theorem 2 when $\epsilon = 1$, with
$C = 168$, $\nu = 8$.
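
The proof for $\epsilon = 1$ is constructive, and its steps can be followed mechanically. The sketch below is an added illustration, not the author's program: it takes $f(x) = (x^3 + 5x)/6$, finds $m$ as in Lemma 1 by direct search, scans $v$ upward from $\tfrac12\eta$ until $Q = N - 5 - B$ falls in $[0, 3^{2n}]$, and, instead of fixing $v$ modulo 8, simply accepts the first $Q$ not of the form $4^a(8k + 7)$. The function names, the three-square search and the use of Legendre's criterion are assumptions of the sketch.

```python
from math import isqrt

def f(x):
    # f(x) = x + (x^3 - x)/6, i.e. (9) with epsilon = 1; always an integer
    return (x**3 + 5 * x) // 6

def representable(q):
    # Legendre: q is a sum of three squares unless q = 4^a (8k + 7)
    while q > 0 and q % 4 == 0:
        q //= 4
    return q % 8 != 7

def three_squares(q):
    # plain search for x >= y >= z >= 0 with x^2 + y^2 + z^2 = q
    for x in range(isqrt(q), -1, -1):
        if 3 * x * x < q:
            break
        r = q - x * x
        for y in range(min(x, isqrt(r)), -1, -1):
            z2 = r - y * y
            if z2 > y * y:
                break
            z = isqrt(z2)
            if z * z == z2:
                return x, y, z
    return None

def nine_values(s, C=168, bs=(5, 7, 11)):
    n = 0
    while C * 27 ** (n + 1) <= s:
        n += 1
    if n < 8:
        raise ValueError("s must be at least C * 27**8")
    i = 1
    while 3 ** i * C * 27 ** n <= s:       # locate the sub-interval of s
        i += 1
    b, p = bs[i - 1], 3 ** n
    m = next(m for m in range(p) if (s - f(3 * m)) % p == 0)   # Lemma 1
    N = (s - f(3 * m)) // p - p * p        # M = 3^(2n) + N
    eta = 3 * b * p                        # v + w = eta, as in (14)
    for v in range((eta + 1) // 2, eta + 1):
        braces = eta * eta - 1 - 3 * v * (eta - v)   # quantity in { } in (15)
        Q = N - 5 - (3 * b + b * braces // 2)        # (16) with epsilon = 1
        if Q > p * p:
            continue
        if Q < 0:
            break
        if representable(Q):
            xs = three_squares(Q)
            return [3 * m, v, eta - v] + [p - x for x in xs] + [p + x for x in xs]
    return None
```

For example, `nine_values(168 * 27**8 + 123456789)` should return nine non-negative integers whose $f$-values sum to the argument; the search over $m$ costs $3^n$ steps, so the sketch is practical only for $n$ near 8.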


6. PROOF OF THEOREM 2 WHEN $\epsilon$ IS PRIME TO 6 AND $\epsilon > 1$

We shall first determine $b_1$, $b_2$, $b_3$, $C$, $\nu$ so that all three inequalities (26)
hold when $n \ge \nu$, viz.,

(31)   $I_i \le 3^{i-1}C \le S_i \qquad (i = 1, 2, 3).$

Minimum values of $C$ and $b_i$ may be found by the following scheme. Take
$b_1$ to be the least positive odd integer for which $I_1 \le S_1$. Take $C$ to be the least
integer $\ge I_1$. Take $b_2$ and $b_3$ to be the least positive odd integers for which
$S_2 \ge 3C$, $S_3 \ge 9C$. We find that

       ε     b_1    b_2    b_3         C
       5     13     19     27      15182
       7     17     25     35      49510
      11     27     39     57     309485
      13     31     45     65     564244

For these values we find that $I_2 \le 3C$, $I_3 \le 9C$, whence (31) hold.


For a general $\epsilon$, we shall choose $b_i$ to be a linear function of $e$ which has
the value in the tablette when $\epsilon = 5, \dots, 13$, and is such that (31) hold as
regards the coefficients of the highest power of $e$.

A. For $\epsilon = 6e + 1$, we take* $b_1 = 14e + 3$ and get

(32)   $C = 24354e^4 + 18882e^3 + 5508e^2 + 728e + 38.$

Then $I_1 < C < S_1$.

A$_1$. If $e$ is odd, take $b_2 = 21e + 4$. We find that $I_2$ is termwise (as to coefficients
of powers of $e$) less than $3C$, and that $S_2$ is termwise $> 3C$. Taking
$b_3 = 29e + 6$, we find that $I_3 < 9C < S_3$.

A$_2$. If $e$ is even, $e > 0$, take $b_2 = 21e + 3$, $b_3 = 29e + 7$. Since $b_3$ exceeds that in
Case A$_1$, evidently $S_3 > 9C$. Similarly, $I_2 < 3C$. Computation gives $S_2 > 3C$,
$I_3 < 9C$, since $e \ge 2$.

B. For $\epsilon = 6e - 1$, take $b_1 = 14e - 1$. Then

(33)   $C = 24354e^4 - 10944e^3 + 1917e^2 - 150e + 5,$

and $I_1 < C < S_1$.

B$_1$. If $e$ is odd, take $b_2 = 21e - 2$, $b_3 = 29e - 2$. For every $e \ge 1$ computation
gives $S_3 \ge 9C$. See B$_{12}$.

B$_2$. If $e$ is even, we accent the letters $b$, $I$, $S$. Take $b_2' = 21e - 3$, $b_3' = 29e - 1$.
Computation gives $S_2' \ge 3C$ if $e \ge 2$, $I_3' \le 9C$ for $e \ge 1$. See B$_{12}$.

B$_{12}$. Since $b_2' < b_2$, $b_3 < b_3'$, we have $I_2' < I_2$, $S_2' < S_2$, $I_3 < I_3'$, $S_3 < S_3'$. The
results proved in Cases B$_1$ and B$_2$ therefore imply $I_2' < 3C$, $I_3 < 9C$, $S_3' > 9C$,
and also $S_2 > 3C$ if $e \ge 2$ (while $S_2 \ge 3C$ by the remark above our tablette).
These inequalities together with those in B$_1$ and B$_2$ give all the inequalities
(31).

By (29) and the square of (28) we see that $D_i > 8$ if $3^{2n}$ exceeds a certain
function of $\epsilon$ and $i$, and hence if $n$ is sufficiently large.

If $s_i$ is any given integer, Lemma 1 shows the existence of an integer $m_i'$
such that

      $s_i = f(3m_i') + 3^n M_i', \qquad 0 \le m_i' < 3^n,$

where $M_i'$ is an integer. In (10) take $z = 3m_i'$, $r = 3^n y_i$. Since $\Delta \equiv 3r \pmod{\epsilon}$
and since $\Delta$ has the factor $r$ and hence $3^n$, we have $\Delta = 3^n E$. Since $\epsilon$ is prime to
$3^n$, we get $E \equiv 3y_i \pmod{\epsilon}$. Write

      $m_i = m_i' + 3^n y_i, \qquad M_i = M_i' - E.$

Then

      $f(3m_i) - f(3m_i') = \Delta = 3^n E, \qquad s_i = f(3m_i) + 3^n M_i.$

* Actually the least odd $b_1$ when $e = 1$, 2, 3, or 4.

Since $\epsilon$ is prime to 3, we can choose integers $y_i$ so that

      $3y_i \equiv M_i' - 6 - 3b_i \pmod{\epsilon}, \qquad 0 \le y_i \le \epsilon - 1.$

The last inequality shows that the maximum $m_i$ is $3^n - 1 + 3^n(\epsilon - 1) = \epsilon \cdot 3^n - 1$.
Hence $0 \le m_i < \epsilon \cdot 3^n$, as desired in Lemma 3.
As before, we write $M_i = \epsilon 3^{2n} + N_i$. Thus

      $N_i \equiv M_i' - 3y_i, \qquad N_i - 6 - 3b_i \equiv 0 \pmod{\epsilon}.$

It has been noted that the quantity in { } in (15) is even, whence
$B_i \equiv 3b_i \pmod{\epsilon}$. Hence

      $N_i + \epsilon - 6 - B_i \equiv 0 \pmod{\epsilon}.$

Hence (16) yields an integral value of $Q_i$.
We proved that there exist more than 8 consecutive integral values of $v_i$
which satisfy inequalities (24). By (15),

      $2B_i \equiv 6b_i - 3b_i\epsilon v_i(\eta - v_i) \pmod 8, \qquad \eta = 3b_i 3^n.$

By Lemma 2, we can choose $v_i$ (mod 8) so that $v_i(\eta - v_i) \equiv 2\zeta_i \pmod 8$, where
$\zeta_i$ is any assigned integer. Then

      $N_i + \epsilon - 6 - B_i \equiv N_i + \epsilon - 6 - 3b_i + 3b_i\epsilon\zeta_i \pmod 4.$

We choose $\zeta_i$ so that the second member is $\equiv \epsilon \pmod 4$. Then (16) gives
$Q_i \equiv 1 \pmod 4$.

Since inequalities (26) were satisfied at the outset, we know that $G_i \ge 0$
and that the final inequality (24) holds. Also (24$_1$) was shown to hold. Hence
(17) hold. By (17$_2$), $0 \le \epsilon Q_i \le \epsilon 3^{2n}$. Since $Q_i$ is an integer $\ge 0$ such that $Q_i \equiv 1$
(mod 4), it is well known that $Q_i = \sum_{j=1}^{3} x_j^2$, where the $x_j$ are integers $\ge 0$.
But $Q_i \le 3^{2n}$, whence each $x_j \le 3^n$.

In view of (17$_1$), the integer $w_i$ defined by (14) is $\ge 0$. Then $f(v_i) + f(w_i) = \phi_i$
has the value $3^n B_i$. Hence (16) yields (13), which is the condition that $s_i$ be
the sum of the values of $f(x)$ for the nine values $3m_i$, $v_i$, $w_i$, $3^n - x_j$, $3^n + x_j$ of
$x$, each an integer $\ge 0$.


7. PROOF OF THEOREM 2 WHEN $\epsilon = 2\delta$, $\delta$ PRIME TO 3


We shall determine the $b_i$ and $C$ to satisfy (26), viz., (31). For the following
four values of $\delta$, our later results do not apply:



2 
5 45 145900 
8 77 


In the preceding tablette, the $b_i$ and $C$ have their minimum values and all
inequalities (31) hold.

J. When $\delta = 3d + 1$, $d \ge 1$, we take $b_1 = 14d + 5$ and get

(34)   $C = 24354d^4 + 33795d^3 + 17591d^2 + 4082d + 358,$

and find that $I_1 < C < S_1$.

J$_1$. If $d$ is odd, take $b_2 = 21d + 8$, $b_3 = 29d + 12$. Then $S_2 \ge 3C$, $I_3 \le 9C$.

J$_2$. If $d$ is even, take $b_2 = 21d + 7$, $b_3 = 29d + 11$. Then $I_2 \le 3C$, $S_3 \ge 9C$.

J$_{12}$. The remaining inequalities (31) follow from J$_1$ and J$_2$ as in B$_{12}$.

K. When $\delta = 3d - 1$, $d \ge 4$, take $b_1 = 14d - 5$. Then

(35)   $C = 24354d^4 - 33795d^3 + 17591d^2 - 4058d + 350.$

K$_1$. If $d$ is even, take $b_2 = 21d - 9$, $b_3 = 29d - 9$. Then $I_2 \le 3C$, $S_3 \ge 9C$ for
$d \ge 1$.

K$_2$. If $d$ is odd, take $b_2 = 21d - 10$, $b_3 = 29d - 8$. Then $S_2 \ge 3C$, $I_3 \le 9C$ for
$d \ge 4$.

K$_{12}$. Exactly as in B$_{12}$, the remaining inequalities (31) follow from K$_1$, K$_2$.

Let $\epsilon = 2^e E$, where $E$ is odd and $e \ge 1$. By Lemma 2, we can choose $v_i$
(mod 8) so that the number in { } in (15) is $\equiv 2z_i \pmod 8$, where $z_i$ is
arbitrary, whence $B_i = 3b_i + b_i\epsilon(z_i + 4u_i)$, where $u_i$ is an integer. Write
$\zeta_i = b_i E z_i$, so that $\zeta_i$ has a preassigned residue modulo 4, and
$B_i = 3b_i + 2^e(\zeta_i + 4b_i E u_i)$.

By choice of $y_i$, we made $N_i + \epsilon - 6 - 3b_i$ a multiple of $\epsilon$, say $2^e g_i$. Then

      $N_i + \epsilon - 6 - B_i = 2^e F_i, \qquad F_i = g_i - \zeta_i - 4b_i E u_i.$

We may choose $\zeta_i$ so that $F_i \equiv E \pmod 4$. Then by (16), $\epsilon Q_i \equiv 2^e E$ (mod
$4 \cdot 2^e$), whence $Q_i \equiv 1 \pmod 4$. But $Q_i$ is an integer $\ge 0$. Hence $Q_i$ is a sum of
three squares.


8. REDUCTION OF $D_i$ FROM 8 TO 6


The lower limit $\nu$ of $n$ obtained from $D_i \ge 8$ may be reduced in certain
cases by using $D_i \ge 6$.

First, it suffices to have $D_i \ge 7$. Then (24) holds for 7 consecutive integral
values of $v_i$. If $f$ is the first of them, then the seven together with $f - 1$ form
a complete set of residues modulo 8. Two values of $v$ whose sum is $\eta = 3b_i 3^n$
yield the same value of $P = v(\eta - v)$. The value of $P$ which is apparently
lacking in view of the missing value $f - 1$ of $v$ is actually obtained from the
value $u = \eta - f + 1$ of $v$, and $u \not\equiv f - 1$ since $\eta \not\equiv 2f - 2 \pmod 8$, $\eta$ being odd.

Second, it suffices to have $D_i \ge 6$. Let $f$ be the first of six consecutive integral
values of $v$. Then the six together with $f - 1$ and $f - 2$ form a complete
set of residues modulo 8. We saw that the two values $f - 1$ and $u = \eta - f + 1$ of
$v$ give the same value of $P$; likewise $f - 2$ and $w = \eta - f + 2$. Each of $u \equiv f - 2$
and $w \equiv f - 1 \pmod 8$ reduces to

(36)   $2f - 3 \equiv \eta \pmod 8.$

Hence if (36) does not hold, we may employ $u$ and $w$ instead of the missing
values $f - 1$ and $f - 2$ of $v$, as well as the four residues which together with
these four make a complete set of residues modulo 8, and obtain all four even
residues of 8 as values of $P$.

When (36) holds, we modify our proof as follows: We no longer secure the
value $P' \equiv (f - 1)u \equiv (f - 2)w \equiv (f - 1)(f - 2) \pmod 8$ of $P$. But when $v$ ranges
over six incongruent residues, no one congruent to $f - 1$ or $f - 2$ modulo 8,
we obtain three incongruent even values of $P$ (viz., all except $P'$). In the
notations at the end of §6, we may assign to $\zeta_i$ any one of three values
incongruent modulo 4 (viz., any except $\tfrac12 P'$), and hence choose $\zeta_i$ so that
$\epsilon Q_i \equiv \epsilon$ or $2\epsilon \pmod 4$. Then $Q_i \equiv 1$ or 2 (mod 4) and $Q_i$ is a sum of three
squares. Similarly, in the notations at the end of §7, we secure $F_i \equiv E$ or
$2E \pmod 4$, whence $Q_i \equiv 1$ or 2 (mod 4).
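
The counting argument of this section can be confirmed exhaustively, since only residues modulo 8 matter. The sketch below is an added illustration with a hypothetical helper `p_values`: for every odd $\eta$ and every starting value it counts the distinct values of $P = v(\eta - v)$ modulo 8 over 7 and over 6 consecutive $v$.

```python
def p_values(eta, first, count):
    """Residues mod 8 of v*(eta - v) for `count` consecutive v starting at `first`."""
    return {(v * (eta - v)) % 8 for v in range(first, first + count)}

def loses_one_value(eta, first):
    # condition (36): 2f - 3 congruent to eta modulo 8
    return (2 * first - 3 - eta) % 8 == 0
```

Seven consecutive values always give all four even residues; six give all four unless (36) holds, in which case exactly three remain.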


9. UNIVERSAL THEOREM FOR $\epsilon = 2$


When $\epsilon = 2$, $\delta = 1$, and we saw that (26) are satisfied if $b_1 = 7$, $b_2 = 11$,
$b_3 = 15$, $C = 1025$. The older condition $D_i \ge 8$ requires $n \ge 9$. The condition
$D_i \ge 7$ barely fails to reduce $n$ from 9 to 8. The best condition $D_i \ge 6$ holds
if $n = 8$. Then $D_2 > 14$, $D_1 > 40$. Hence Theorem 2 shows that every integer
$\ge 1025 \cdot 3^{24}$ is a sum of nine values of

      $f(x) = x + \tfrac13(x^3 - x) = \tfrac13(x^3 + 2x)$

for integral values $\ge 0$ of $x$. We employ


THEOREM 3. Let a polynomial $f(x)$ take integral values $\ge 0$ for all integers
$x \ge 0$; let $f(x + 1) - f(x)$ increase with $x$. Suppose that every integer $n$ for which
$l < n \le g + f(0)$ is a sum of $k - 1$ values of $f(x)$ for integers $x \ge 0$. Let $m$ be the
maximum integer for which $f(m + 1) - f(m) < g - l$. Then every integer $N$ for
which $l + f(0) < N \le g + f(m + 1)$ is a sum of $k$ values of $f(x)$ for integers $x \ge 0$.


Tables were made showing all integers 1-3000 which are sums of 2, 3, 4,
or 5 values of $f(x)$. In particular, all integers 1-3000 except only 42 and 66
are sums of 5 values. This result was proved to hold also for 3000-4000 as
follows. A list was made of the integers 2076-3000 which are missing from the
table of sums by 4. To this list we added $924 = f(14)$ and subtracted
$1135 = f(15)$ from the sums (actually we subtracted $1135 - 924 = 211$ from our
list), and noted whether or not each difference is in the table of sums by 4.
If not, we subtracted other values of $f(x)$ from the sum.

We may apply Theorem 3 to $f(x) = \tfrac13(x^3 + 2x)$ since $f(x + 1) - f(x)
= x^2 + x + 1$ increases with $x$. As just proved by tables, every integer $n$ for
which $66 < n \le 4000$ is a sum of 5 values of $f(x)$. The maximum integer $m$ for
which

      $(2m + 1)^2 < 4(4000 - 66) - 3 = 15733$

has $2m + 1 = 125$, $m = 62$. Since $f(63) = 83391$, Theorem 3 shows that all integers
from 67 to 87391 are sums of 6 values of $f(x)$. The same is true of 42 and
66. Hence we may apply Theorem 3 with $l = 0$, $g = 87391$, $k = 7$; the maximum
$m$ is 295: hence all integers $\le g + f(296) = 8732367$ are sums of 7 values of
$f(x)$. The next maximum $m$ is 2954, and all integers $\le 8609777000$ are sums
of 8 values of $f(x)$. The next maximum $m$ is 97788, and all integers $\le 311716 \cdot 10^9$
are sums of 9 values of $f(x)$. This product exceeds $1025 \cdot 3^{24} = 2894902749 \cdot 10^5$.
This proves

THEOREM 4. Every integer $\ge 0$ is a sum of 9 values of $\tfrac13(x^3 + 2x)$ for integers
$x \ge 0$.
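
The small table and the first steps of the ascent by Theorem 3 are easy to recompute. The sketch below is an added illustration: `sums_of` rebuilds the set of sums of $k$ values up to a limit, and `theorem3_step` returns the maximum $m$ with $m^2 + m + 1 < g - l$ together with the new upper limit $g + f(m + 1)$ (here $f(0) = 0$). Both names are assumptions of the sketch, and only the steps for 6, 7 and 8 values are recomputed.

```python
def f(x):
    # f(x) = (x^3 + 2x)/3, always an integer
    return (x**3 + 2 * x) // 3

def sums_of(k, limit):
    """All integers <= limit that are sums of k values f(x), x >= 0 (f(0) = 0 allowed)."""
    vals, x = [], 0
    while f(x) <= limit:
        vals.append(f(x))
        x += 1
    reach = {0}
    for _ in range(k):
        reach = {s + v for s in reach for v in vals if s + v <= limit}
    return reach

def theorem3_step(l, g):
    """Largest m with f(m+1) - f(m) = m^2 + m + 1 < g - l, and the new limit g + f(m+1)."""
    m = 0
    while (m + 1)**2 + (m + 1) + 1 < g - l:
        m += 1
    return m, g + f(m + 1)

five = sums_of(5, 3000)
missing = [n for n in range(1, 3001) if n not in five]
```

Printing `missing` reproduces the list of exceptions in the table of sums of 5 values up to 3000.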

Besides the functions treated, there arose in Theorem 1 the special pyramidal
function $P(x)$. For this case, K. C. Yang proved in his Chicago doctoral
dissertation of 1928

THEOREM 5. Every integer $\ge 0$ is a sum of nine pyramidal numbers $\tfrac16(x^3 - x)$
for integers $x \ge 0$.

He verified that every integer $\le 7240$ is a sum of five values.

I have not completed the proof that every integer is a sum of nine values
of function (9) for $\epsilon = 1$.


UNIVERSITY OF CHICAGO, 
CHICAGO, ILL.


ON THE CONVERGENCE AND OVERCONVERGENCE OF
SEQUENCES OF POLYNOMIALS OF BEST SIMUL- 
TANEOUS APPROXIMATION TO SEVERAL 
FUNCTIONS ANALYTIC IN 
DISTINCT REGIONS* 


BY 
J. L. WALSH AND HELEN G. RUSSELL


1. Introduction. The purpose of this paper is to present some theorems on 
the convergence and overconvergence of sequences of polynomials of best 
approximation to a function f(z) analytic on a closed limited point set whose 
complement is of multiple (finite or infinite) connectivity. Our main theorem 
is the following: 


THEOREM I. Let $M$ be an arbitrary closed limited point set of the $z$-plane
whose complement $K$ is connected and possesses a Green’s function with pole at
infinity.† Let $w = w(z)$ be a function which maps $K$ (conformally but not necessarily
uniformly) onto the exterior of the unit circle in the $w$-plane so that the
points at infinity in the two planes correspond to each other. Let $C_R$ denote the
transform (i.e., in $K$) of $|w| = R$, $R > 1$, under the mapping function $w = w(z)$.

(1) If the function $f(z)$ is analytic and single-valued on and within $C_R$, there
exist polynomials $P_n(z)$ of respective degrees $n$,‡ $n = 1, 2, \dots$, such that the
inequalities

(a)   $|f(z) - P_n(z)| \le N/R^n, \qquad z$ on $M$, $R > 1,$

where $N$ is dependent on $R$ but not on $n$ or $z$, are valid for every $z$ on $M$.

(2) If there exist polynomials $P_n(z)$ of degree $n$, $n = 1, 2, \dots$, such that (a)
is valid for every $z$ on $M$, then the sequence $\{P_n(z)\}$ converges interior to $C_R$,
uniformly on any closed point set interior to $C_R$, and thus $f(z)$ is analytic§
throughout the interior of $C_R$.


* Presented to the Society, October 29, 1932; received by the editors February 27, 1933. 

† The requirement that $K$ should possess a Green’s function is equivalent to the requirement
that $K$ should be regular, in the sense that the Dirichlet problem (for arbitrary continuous boundary
values) can be solved for $K$. See Kellogg, Proceedings of the National Academy of Sciences, vol. 12
(1926), pp. 397-406.

‡ A polynomial of degree $n$ in $z$ is any expression of the form $a_0 + a_1 z + a_2 z^2 + \cdots + a_n z^n$.

§ If $f(z)$ is not originally assumed to be defined on the entire point set indicated, then the definition
in the new points is to be made by analytic extension interior to $C_R$, or, what amounts to the
same thing, by means of the convergent sequence of polynomials. 




14 J. L. WALSH AND H. G. RUSSELL [January 


The Green’s function $G(x, y)$ with pole at infinity for the region $K$ is
(1) harmonic in $K$ except at infinity where $G(x, y) = \log r + G_1(x, y)$,
$r = (x^2 + y^2)^{1/2}$, and $G_1(x, y)$ is harmonic at infinity, and (2) $G(x, y)$ is
continuous and vanishes on the boundary of $K$.
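
Two standard special cases may help fix the notation. They are added here as an illustration and are not taken from the paper: the choices of $M$ below are assumptions, and the formulas are the classical ones for the disc and for a segment.

```latex
% Illustration only: M is assumed to be the closed unit disc, then the segment [-1, 1].
\[
  M = \{\, |z| \le 1 \,\}: \qquad
  G(x,y) = \log r, \quad G_1 \equiv 0, \quad w(z) = z, \quad C_R : |z| = R .
\]
\[
  M = [-1, 1]: \qquad
  w(z) = z + \sqrt{z^2 - 1}, \quad G(x,y) = \log |w(z)| ,
\]
\[
  C_R : \ \text{the ellipse with foci } \pm 1 \text{ and semi-axes }
  \tfrac12\left(R + R^{-1}\right), \ \tfrac12\left(R - R^{-1}\right).
\]
```

In the first case inequality (a) of Theorem I is the familiar geometric rate of convergence of a Taylor development on the unit disc; the second is the case of a segment of the axis of reals.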

It will be noticed that the hypothesis on the point set M is satisfied pro- 
vided M is closed, limited, without isolated points, and provided K is con- 
nected and of finite connectivity. 

This theorem is known for the case that the complement of M is simply 
connected and that M is not a single point. The second part of the theorem 
for that case is due to Walsh and the formulation of the entire theorem to- 
gether with detailed references was published by him.* Among the writers to 
whom various parts of the theorem are due are Faber, S. Bernstein, M. Riesz, 
Fejér, and Szegő; the theorem for the case that M is a segment of the axis
of reals is due to Bernstein. The generalization to be proved here is made pos- 
sible by the consideration of the equipotential curves for the infinite region 
K and of approximation by them to the boundary of K, by the approximation 
to analytic curves by lemniscates, and finally by the use of a sequence of 
polynomials found by interpolation. 

By means of Theorem I we shall derive some results on convergence and
overconvergence, results which are generalizations of results already
established by Walsh in the less general case mentioned. We study also the
convergence of sequences of polynomials of best approximation, where best 
approximation is measured (1) in the sense of Tchebycheff, (2) by line in- 
tegrals taken over rectifiable Jordan curves bounding the point set considered, 
(3) by surface integrals taken over the region considered. The two latter 
methods of approximation yield interesting results in regard to polynomials 
belonging to a point set, a problem which has been considered in the case of a 
simply connected region by Faber, Fejér, Szegő, Bergmann, Bochner, Carleman,
and Smirnoff.‡ All three methods of approximation yield results on the
exact region of uniform convergence of the sequence of polynomials of best 
approximation and show that this region depends not merely on the singu- 
larities of the given function f(z) but also on the monogenic character of the 
function f(z). 

The term overconvergence is used in the sense of Walsh to denote that if a
sequence of polynomials converges sufficiently rapidly on a point set $M$
of the kind described, then that sequence necessarily converges also on a certain
larger point set containing $M$ in its interior.


* Münchner Berichte, 1926, pp. 223-229.
† These Transactions, vol. 32 (1930), pp. 794-816; these Transactions, vol. 33 (1931), pp. 370-388.
We shall refer to these papers as (1) and (2) respectively.
‡ Detailed references are given below.


1934] SEQUENCES OF POLYNOMIALS 15


2. Approximation by analytic curves to the boundary of a given point set. 
We shall prove several lemmas. 


LEMMA I. Let $M$ be a closed limited point set of the $z$-plane whose complement
$K$ is connected and possesses a Green’s function $G(x, y)$ with pole at infinity.
Then $w = w(z) = e^{G + iH}$, where $H$ is conjugate to $G$, is a function which maps $K$
onto the exterior of the unit circle in the $w$-plane so that the points at infinity in
the two planes correspond.

The equipotential lines, $G = c$, $c > 0$, take the following forms: (1) the locus
$G = c$ consists of a finite number of simple analytic closed curves, mutually
exterior, bounding an infinite region $T$ of points $G > c$; or (2) the locus $G = c$
consists of a finite number of mutually exterior closed curves, at least one of
which has a multiple point of order $m \ge 2$, bounding an infinite region $T$ of points
$G > c$.


Consider the set of points T: G>c, in which we count the point at infinity. 
Because G is continuous in K except at infinity, the boundary points of T all 
belong to the equipotential G=c. Conversely, all points of G=c are boundary 
points of T. If not, then in a neighborhood of a point P of G=c which is not 
a boundary point of T, we have only points G Sc. Since G is harmonic in this 
neighborhood, by Gauss’ mean-value theorem G equals ¢ on the circumfer- 
ence of a sufficiently small circle about P, and we have a contradiction. 

The set T is a region, that is, every point of the set G>c is an interior
point of the set, and any two points of the set can be connected by a Jordan
arc all of whose points belong to the set. Otherwise, a region $T_1$ belonging to
the set T exists not including the point at infinity and having G=c on its
entire boundary. Since G is harmonic in $T_1$, G is identically equal to c in $T_1$,
which leads to a contradiction.

The locus G=c, c>0, consists of analytic arcs which fall into a set of 
closed curves; otherwise the continuity hypothesis is contradicted.* 

The locus G=c, c>0, consists of a finite number of curves. If M is bounded
by a finite number of mutually exclusive closed point sets, the statement
follows at once from the facts that any curve on which G=c>0 contains
in its interior points of M and no two loci G=0 and G=c>0 have a
common point. If M is bounded by an infinite number of mutually exclusive
point sets, assume the curves G=c: $C^{(1)}, C^{(2)}, \dots$ to be infinite in number
and consider a point $P_1$ on $C^{(1)}$, $P_2$ on $C^{(2)}$, .... Since G=c is a closed limited
point set, these points must have a limit point P on G=c. If P is not a point
at which the gradient of G vanishes, the curve G=c through P is a single


* See for instance Kellogg, Foundations of Potential Theory, Berlin, 1929, pp. 273-277. 


16 J. L. WALSH AND H. G. RUSSELL [January 


analytic piece, as the theorem on implicit functions shows.* If P is a point
at which the gradient vanishes, it is not a limit point of points at which the
gradient vanishes, for such limit points can occur only on the boundary of K, as is
evident from consideration of the derivative of the analytic function f(z)
of which G is the real part.† If P is a point at which the gradient of G
vanishes, the analytic arcs of which G=c consists in the neighborhood of P
are finite in number and they pass through the point P with equally spaced
tangents,‡ so P cannot be a limit point of points on $C^{(1)}, C^{(2)}, \dots$. Hence
we have reached a contradiction; and the statement that any locus G=c,
c>0, consists of a finite number of curves is true.

The locus G=c, c>0, consists either entirely of mutually exterior simple
curves, or of mutually exterior curves some of which may be simple but at
least one of which, C’, has a multiple point of order m, m ≥ 2; and C’ contains
in its interior (i.e. the finite regions bounded by C’) at least m mutually
exclusive closed sets belonging to M. The proof is similar to that already given
and is left to the reader.

If the region K is of connectivity greater than unity, there is at least one 
value of c for which the locus G=c contains a curve with a multiple point. 

The number of curves of which the locus G=c is composed increases 
monotonically (if at all) as c decreases. The locus G=c consists of a finite 
number of mutually exterior simple curves, except for a countable set of 
values of c. 

Out of Lemma I follows, as the reader will easily verify, 


Lemma II. Under the hypotheses of Lemma I, the point sets bounding the
infinite region K can be approximated by finite sets of mutually exterior analytic
curves G=c. More explicitly, the equipotential loci $J^{(i)}$: $G = c_i$, i = 1, 2, ...,
$c_1 > c_2 > c_3 > \cdots \to 0$, lie in K and are such that the region interior to $J^{(i+1)}$ is
contained in the region interior to $J^{(i)}$, $J^{(i)}$ and $J^{(i+1)}$ have no common points,
and every point in K lies exterior to some $J^{(i)}$. If the $c_i$ are suitably chosen, each
$J^{(i)}$ consists of a finite number of mutually exterior analytic simple curves.


3. Approximation to several analytic curves by a lemniscate. The locus of
a point the product of whose distances from m fixed points is constant is a
lemniscate. Thus, if the given points are $a_1, a_2, \dots, a_m$, the lemniscate is
defined by the equation |P(z)| = c, where $P(z) = (z-a_1)(z-a_2)\cdots(z-a_m)$.
For m=1, the lemniscate is a circle; for m=2, the lemniscate is a Cassinian
oval. We note that |P(z)| = 0 consists of the points $z = a_i$, i = 1, 2, ..., m;


* Osgood, Lehrbuch der Funktionentheorie, vol. 1, Leipzig, 1923, p. 675.
† The proof follows that of Kellogg in the case that K is simply connected: loc. cit., pp. 364-365.
‡ See for instance Kellogg, loc. cit., p. 275.




and, in the general case, since $G = \log\,[\,|P(z)|/|c|\,]^{1/m}$ is Green’s function
with pole at infinity for the region exterior to $|P(z)| = c \neq 0$, the curves
$G = \log \rho$, $\rho > 1$, or $|P(z)| = c\rho^m$, for $\rho$ sufficiently near unity and c sufficiently
near zero, are m simple closed analytic curves, each containing one root
$a_i$, i = 1, 2, ..., m, of P(z) = 0, if the $a_i$ are all distinct.
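
The relation between the lemniscate |P(z)| = c and its Green's function can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names and the sample values (roots ±1, c = 0.5, ρ = 1.2) are chosen here purely for illustration.

```python
import math

def lemniscate_modulus(z, roots):
    """Return |P(z)| for P(z) = (z - a_1)(z - a_2)...(z - a_m)."""
    value = 1.0
    for a in roots:
        value *= abs(z - a)
    return value

def green_exterior(z, roots, c):
    """G(z) = (1/m) log(|P(z)|/c), taken exterior to the lemniscate |P(z)| = c."""
    m = len(roots)
    return math.log(lemniscate_modulus(z, roots) / c) / m

# Illustrative data (assumptions): P(z) = z^2 - 1, so m = 2.
roots = [1.0, -1.0]
c = 0.5
rho = 1.2

# A point of the curve |P(z)| = c * rho^m on the positive real axis:
# z^2 - 1 = c * rho^2.
z = math.sqrt(1.0 + c * rho**2)

# On that curve G should equal log(rho).
print(green_exterior(z, roots, c), math.log(rho))

# |P(0)| = 1, so the origin lies outside the lemniscate whenever c < 1;
# this is where the two ovals join into a double point as c reaches 1.
print(lemniscate_modulus(0.0, roots))
```

For m = 2 the curves are Cassinian ovals: with c < 1 each oval contains one of the roots ±1, in agreement with the statement above for c sufficiently near zero.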

The possibility of approximation of analytic curves by lemniscates is the 
basis of our proof of Theorem I, and Theorem I is the source of all succeeding 
results in this paper. 


Lemma III. A finite number k of arbitrary mutually exterior closed analytic
curves can be approximated by the same lemniscate; that is to say, given a set C
consisting of k mutually exterior closed analytic curves $C_j$, j = 1, 2, ..., k, and
a number $\eta > 0$ such that the $\eta$-neighborhoods of the $C_j$ are distinct, a lemniscate
$\Gamma$: $|(z-a_1)(z-a_2)\cdots(z-a_m)| = c$ exists which lies exterior to C and interior
to these $\eta$-neighborhoods, and contains C in its interior.*

Let $C_j'$, j = 1, 2, ..., k, be k curves constructed as follows:

(1) The curve $C_j'$ is contained in the region swept out by a circle of radius $\eta$
whose center describes $C_j$.

(2) The curve $C_j'$ contains in its interior one and only one of the given curves,
say $C_j$, and lies exterior to C.

(3) The curves $C_j'$ lie exterior to one another.

Let $s(\zeta)$ measure arc length on the curves $C_j$ whose lengths are $d_j$,
j = 1, 2, ..., k. For $0 \le s(\zeta) \le d_1$, $\zeta$ shall lie on $C_1$; for $d_1 < s(\zeta) \le d_1 + d_2$,
$\zeta$ shall lie on $C_2$; ...; for $\sum_{j=1}^{k-1} d_j < s(\zeta) \le \sum_{j=1}^{k} d_j$, $\zeta$ shall lie on $C_k$.

Green’s function G(x, y) exists† for the region exterior to C: (1) G(x, y)
is harmonic exterior to C except at infinity where $G(x, y) = \log r + G_1(x, y)$,
and $G_1(x, y)$ is harmonic at infinity and has the value $-\mu$ at infinity, and
(2) G(x, y) is continuous and vanishes on C. We define a function V(x, y) so
that $V(x, y) = G(x, y) + \mu$; and we now prove that there exists a continuous
positive function

\[ \phi(s) = \frac{1}{2\pi}\,\frac{\partial V(x, y)}{\partial n} \]

(n is the exterior normal for C) such that


* This lemma was proved by Hilbert in the case of approximation to one analytic curve and
applied to approximation of analytic functions by polynomials: Göttinger Nachrichten, 1897, pp.
63-70.

Simultaneous approximation of several distinct curves by lemniscates has also been used by
other writers, especially Faber, Szegő, Fekete, and Pólya, in connection with the approximation of
functions by polynomials and related topics, but without detailed proof of the results of the present
paper. See particularly Faber, Münchner Berichte, 1922, pp. 157-178, and for further references
Pólya and Szegő, Crelle’s Journal, vol. 165 (1931), pp. 4-49.

† Osgood, loc. cit., pp. 687-703.




\[ V(x, y) = \int_C \phi(s)\,\log r\,ds, \]


where now $r = |z - \zeta|$, $ds = |d\zeta|$, and z = x+iy is any point of the z-plane
exterior to C.

By a familiar theorem of potential theory, a function G(x, y) which is (1)
harmonic in the region S which is bounded by C and a circle $C_0$ (with center
P:(x, y) exterior to C) containing C in its interior, and (2) continuous
together with its partial derivatives of the first order on the boundary of S,
satisfies the following equation:

\[ \text{(a)}\qquad G(x, y) = \frac{1}{2\pi}\int_{C_0}\Big(\log r\,\frac{\partial G}{\partial n} - G\,\frac{\partial \log r}{\partial n}\Big)\,ds + \frac{1}{2\pi}\int_{C}\Big(\log r\,\frac{\partial G}{\partial n} - G\,\frac{\partial \log r}{\partial n}\Big)\,ds. \]

Here r denotes distance from P:(x, y) and n denotes interior normal with
respect to S.
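If we use the Green’s function $G(x, y) = \log r + G_1(x, y)$, which vanishes on C, we have from (a)

\[ \text{(b)}\qquad G(x, y) = \frac{1}{2\pi}\int_{C}\log r\,\frac{\partial G}{\partial n}\,ds - \mu, \]

for we have

\[ G(x, y) = \frac{1}{2\pi}\int_{C_0}\Big(\log r\,\frac{\partial G_1}{\partial n} - G_1\,\frac{\partial \log r}{\partial n}\Big)\,ds + \frac{1}{2\pi}\int_{C}\log r\,\frac{\partial G}{\partial n}\,ds, \]

\[ \int_{C_0}\log r\,\frac{\partial G_1}{\partial n}\,ds = 0, \qquad \frac{1}{2\pi}\int_{C_0} G_1\,\frac{\partial \log r}{\partial n}\,ds = \mu. \]

Consequently,

\[ G(x, y) + \mu = \frac{1}{2\pi}\int_{C}\log r\,\frac{\partial G}{\partial n}\,ds, \]

\[ V(x, y) = \frac{1}{2\pi}\int_{C}\log r\,\frac{\partial V}{\partial n}\,ds. \]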


The function $\partial V/\partial n$ is continuous on C since $\partial G/\partial n$ is continuous on C, and
$\partial V/\partial n$ is positive since V(x, y) is harmonic exterior to C except at infinity
where it is logarithmically infinite. Hence

\[ V(x, y) = \int_{C}\Big(\frac{1}{2\pi}\,\frac{\partial V}{\partial n}\Big)\log r\,ds, \]

and

\[ \phi(s) = \frac{1}{2\pi}\,\frac{\partial V}{\partial n} \]

is the function desired.

Since V(x, y) is harmonic exterior to C except at infinity, V(x, y) takes on
all the $C_j'$ a minimum value $\mu_1$. We now choose $\epsilon > 0$ such that $\epsilon < (\mu_1 - \mu)/2$.
Since $V(x, y) - \mu$ is Green’s function for the region exterior to C we may apply
Lemma I. If $V = \mu + \epsilon$ is a locus consisting of curves not all of which are simple,
some curve of the locus must intersect a $C_j'$. Since $\mu + \epsilon < \mu_1$, and $\mu_1$ is the
minimum value of V on the $C_j'$, the curve cannot cut $C_j'$. Hence $V = \mu + \epsilon$
consists of a simple closed curve $\gamma_1$ in the ring $C_1C_1'$, a simple closed curve $\gamma_2$ in
the ring $C_2C_2'$, ..., a simple closed curve $\gamma_k$ in the ring $C_kC_k'$.

By similar reasoning, $V = \mu_1 - \epsilon$ consists of a simple closed curve $\gamma_1'$ in the
ring $C_1C_1'$, a simple closed curve $\gamma_2'$ in the ring $C_2C_2'$, ..., a simple closed
curve $\gamma_k'$ in the ring $C_kC_k'$.

We denote by $\gamma_j\gamma_j'$ the ring region bounded by $\gamma_j$ and $\gamma_j'$. We let

\[ u_0 = \int_0^{\sum_{j=1}^{k} d_j} \phi(s)\,ds; \]

and we make the change of variable

\[ u(\zeta) = \int_0^{s(\zeta)} \phi(s)\,ds, \]

so that u increases from 0 to $u_0$ as s increases from 0 to $\sum_{j=1}^{k} d_j$. Then

\[ V(x, y) = \int_0^{u_0} \log r\,du, \]

\[ V(x, y) = \lim_{n\to\infty}\,(u_0/n)(\log r_1 + \log r_2 + \cdots + \log r_n), \]

where $r_1, r_2, \dots, r_n$ are distances from z to n points of C, corresponding to
equidistant values $u_1 = u_0/n$, $u_2 = 2u_0/n$, ..., $u_n = u_0$ of u. For simplicity, the
dependence of $r_i$ on n is not indicated in the notation.
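
The passage from the integral for V to the finite sum can be illustrated in the simplest case. The sketch below is an assumption-laden example, not taken from the paper: C is taken to be the unit circle, for which the density is constant, u_0 = 1, and V(z) = log |z| for |z| > 1; the function name is hypothetical.

```python
import cmath
import math

def discrete_potential(z, n):
    """(1/n) * sum of log|z - zeta_k| over n equally spaced points of |zeta| = 1.

    For the unit circle the variable u is proportional to arc length, so
    equidistant values of u correspond to equally spaced points.
    """
    total = 0.0
    for k in range(n):
        zeta = cmath.exp(2j * math.pi * k / n)
        total += math.log(abs(z - zeta))
    return total / n

z = 2.0 + 0.0j
for n in (4, 16, 64):
    # The sum approaches V(z) = log|z| as n grows.
    print(n, discrete_potential(z, n), math.log(abs(z)))
```

In this special case the product of the distances is |z^n − 1|, so the finite sum equals (1/n) log |z^n − 1|, which makes the convergence to log |z| outside the circle explicit.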

For z interior to $\gamma_j\gamma_j'$, convergence of the sequence of functions $u_0 \log r_1$,
$(u_0/2)(\log r_1r_2)$, ..., $(u_0/N)(\log r_1r_2\cdots r_N)$, ..., to V(x, y) is uniform.*
For N sufficiently large, $n \ge N$, and z interior to $\gamma_j\gamma_j'$, we have

\[ -\epsilon' < V(x, y) - (u_0/n)(\log r_1r_2\cdots r_n) < \epsilon'. \]


* The detailed proof of uniformity offers no difficulty. See for instance Walsh, Bulletin of the 
American Mathematical Society, vol. 35 (1929), pp. 499-544; Lemma, p. 538. 




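If we denote by $\Gamma$ the locus exterior to C: $(u_0/N)(\log r_1r_2\cdots r_N) = \lambda$,
where $\mu + \epsilon < \lambda < \mu_1 - \epsilon$, and choose $\epsilon'$ sufficiently small, we have, on $\Gamma$,
$\mu + \epsilon < \lambda - \epsilon' < V(x, y) < \lambda + \epsilon' < \mu_1 - \epsilon$. Every Jordan arc joining a point of $\gamma_j$
to a point of $\gamma_j'$ must cut $\Gamma$. The locus $\Gamma$ has the following properties:


(1) $\Gamma$ consists of a curve $\Gamma_1$ enclosing $\gamma_1$, a curve $\Gamma_2$ enclosing $\gamma_2$, ...,
and a curve $\Gamma_k$ enclosing $\gamma_k$. Otherwise there would exist a region in some
$\gamma_j\gamma_j'$ in which the harmonic function $(u_0/N)(\log r_1r_2\cdots r_N)$, constant on
$\Gamma$, would be identically constant, which is impossible.


(2) $\Gamma$ lies in the rings $\gamma_j\gamma_j'$, since $\lambda$ is such that
$\mu + \epsilon < \lambda - \epsilon' < V(x, y) < \lambda + \epsilon' < \mu_1 - \epsilon$.

(3) $\Gamma$ is a lemniscate, for the equation

\[ r_1r_2\cdots r_N = e^{N\lambda/u_0} \]

is of the form |P(z)| = c > 0.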


The proof of Lemma III is now complete. 
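
For reference, the step from the level locus of the finite sum to a lemniscate can be written out as follows. This is a short worked restatement; the symbols $\zeta_1, \dots, \zeta_N$ for the N chosen points of C are notation introduced here, not in the original.

```latex
\begin{align*}
\frac{u_0}{N}\,\log\,(r_1 r_2 \cdots r_N) = \lambda
  &\iff r_1 r_2 \cdots r_N = e^{N\lambda/u_0}, \qquad r_i = |z - \zeta_i|, \\
  &\iff |(z-\zeta_1)(z-\zeta_2)\cdots(z-\zeta_N)| = c, \qquad c = e^{N\lambda/u_0} > 0 .
\end{align*}
```

Thus the poles $a_i$ of the lemniscate of Lemma III may be taken to be points of C itself, and its degree m is the number N of points used in the sum.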

If we choose $\eta$ successively 1, 1/2, ..., 1/n, ..., we have lemniscates
$\Gamma_1, \Gamma_2, \dots, \Gamma_n, \dots$, exterior to C. From this set can be extracted a subset
$\Gamma_{n_i}$ such that (1) each $\Gamma_{n_{i+1}}$ is interior to $\Gamma_{n_i}$, (2) $\Gamma_{n_i}$ and $\Gamma_{n_{i+1}}$ have no common
point, and (3) every point exterior to C lies exterior to some $\Gamma_{n_i}$.

4. Lemmas involving the mapping function w(z). We prove the following 
lemmas: 


Lemma IV. Under the hypotheses of Lemma I, a multiple point of order m
of the curves G=c, c>0, occurs at (x, y) = (x’, y’) if and only if $w'(z') = 0$,
$w''(z') = 0$, ..., $w^{(m-1)}(z') = 0$, $w^{(m)}(z') \neq 0$, z’ = (x’, y’), m ≥ 2, simultaneously,
that is, when and only when z = z’ is a branch point of the inverse of the mapping
function $w(z) = e^{G+iH}$.

The proof of this lemma is essentially included in Lemma I. 


Lemma V. Let M, K, w(z) be defined as in Theorem I. Let $C_R$ denote* the
transform in the z-plane of |w| = R, R>1, under the mapping function w(z).
Let $\rho$ be arbitrary, $1 < \rho < R$. Then there exists a lemniscate (of Lemma III)
$\Gamma$: $|(z-a_1)\cdots(z-a_m)| = c$ such that $\Gamma$ contains M in its interior and such
that $\Gamma_{R/\rho}$: $|(z-a_1)\cdots(z-a_m)| = cR^m/\rho^m$ lies interior to $C_R$. Thus for z on
and within $\Gamma$ (hence on M) and for t on or exterior to $\Gamma_{R/\rho}$, we have

\[ \left|\frac{(z-a_1)\cdots(z-a_m)}{(t-a_1)\cdots(t-a_m)}\right| \le \frac{\rho^m}{R^m}. \]


* A symbol of the form $C_R$ denotes henceforth the transform in the z-plane of |w| = R, R>1,
under the mapping function w(z).




Let $\Gamma$ be a lemniscate contained in K and lying interior to $C_\rho$. Then the
locus $\Gamma_{R/\rho}$ lies interior to $[C_\rho]_{R/\rho} = C_R$, as follows from the study of the Green’s
functions for the exterior of $C_\rho$ and the exterior of $\Gamma$. If these Green’s functions
are denoted by $G_1$ and $G_2$ respectively, their difference $G_1 - G_2$ is negative
on $C_\rho$, hence harmonic and negative exterior to $C_\rho$ even at infinity. Then
on $\Gamma_{R/\rho}$ we have $G_1 - \log R/\rho < 0$, so on $\Gamma_{R/\rho}$ we have $G_1 < \log R/\rho$; the curve
$C_R$: $G_1 = \log R/\rho$ lies exterior to $\Gamma_{R/\rho}$.


Lemma VI. Let M, K, w(z) be defined as in Theorem I. If Q(z) is a polynomial
of degree n such that $|Q(z)| \le L$, z on M, then

\[ |Q(z)| \le L R_0^{\,n}, \qquad z \text{ on and within } C_{R_0},\ R_0 > 1. \]


The special case in which M is a line segment is due to S. Bernstein,* and
the method used in proving Lemma VI is a generalization of the method of
M. Riesz† for this special case. This lemma was proved by Walsh‡ in case
K is simply connected and the possibility of its extension to the more general
case was indicated by him. See also Faber (loc. cit.), who proves Lemma VI
for a set M bounded by a finite number of Jordan curves.
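
The line-segment case of Lemma VI can be checked numerically. The sketch below assumes M is the segment from −1 to 1, for which the level curves C_R are the images of |w| = R under z = (w + 1/w)/2, and takes Q to be the classical polynomial T_n with |T_n| ≤ 1 on M. These choices and the function names are illustrative assumptions, not part of the paper.

```python
import cmath
import math

def chebyshev_T(n, z):
    """Evaluate T_n(z) by the three-term recurrence; z may be complex."""
    t_prev, t_curr = 1.0 + 0j, complex(z)
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_curr = t_curr, 2 * z * t_curr - t_prev
    return t_curr

def max_on_level_curve(n, R, samples=720):
    """Largest sampled value of |T_n| on C_R, the image of |w| = R."""
    best = 0.0
    for k in range(samples):
        w = R * cmath.exp(2j * math.pi * k / samples)
        z = (w + 1 / w) / 2
        best = max(best, abs(chebyshev_T(n, z)))
    return best

n, R = 8, 1.5
# Lemma VI with L = 1 predicts |T_n(z)| <= R**n on C_R.
print(max_on_level_curve(n, R), R**n)
```

Here the maximum on C_R equals (R^n + R^{-n})/2, about half of the bound R^n, so the inequality of the lemma holds with room to spare in this example.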

5. Approximation to an analytic function. We proceed now to the proof 
of Theorem I. 

We first prove (1). Since f(z) is analytic on and within $C_R$, the function
f(z) is also analytic on and within some $C_{R'}$, R’>R. Choose the present R’ as
the R of Lemma V and the present ratio R’/R as the quantity $\rho$ of Lemma V.
Then by Lemma V there exists a lemniscate $\Gamma$: $|(z-a_1)\cdots(z-a_m)| = c$
containing M in its interior, while $\Gamma_R$: $|(z-a_1)\cdots(z-a_m)| = cR^m$ is interior
to $C_{R'}$; for z on and within $\Gamma$ (in particular on M) and for t on or exterior to
$\Gamma_R$ (in particular on $C_{R'}$), we have

\[ \left|\frac{(z-a_1)\cdots(z-a_m)}{(t-a_1)\cdots(t-a_m)}\right| \le \frac{1}{R^m}. \]

A unique polynomial $P_{mp-1}(z)$ of degree mp−1, p = 1, 2, ..., exists with
the properties

\[ P_{mp-1}^{(i)}(a_j) = f^{(i)}(a_j) \qquad (i = 0, 1, 2, \dots, p-1;\ j = 1, 2, \dots, m).\S \]


* Mémoires de l’Académie Royale de Belgique, Classe des Sciences, (2), vol. 4 (1912), pp. 36-94.

† Acta Mathematica, vol. 40 (1916), pp. 337-347.

‡ Münchner Berichte, loc. cit., p. 225.

§ Hilbert (loc. cit.) has exhibited such polynomials in the case of approximation in a simply
connected region. See also Jacobi, Crelle’s Journal, vol. 53 (1856-57), pp. 103-126; and Montel,
Leçons sur les Séries de Polynomes, Paris, 1910, pp. 47-49, 95-97.



Two distinct polynomials $P_{mp-1}(z)$ of degree mp−1 surely cannot satisfy
these conditions, for their difference would have at least mp roots. We
actually exhibit the polynomial $P_{mp-1}(z)$ (necessarily unique):

\[ P_{mp-1}(z) = f(z) - \frac{1}{2\pi i}\int_{C_{R'}} \frac{(z-a_1)^p\cdots(z-a_m)^p\,f(t)}{(t-a_1)^p\cdots(t-a_m)^p\,(t-z)}\,dt, \qquad z \text{ interior to } C_{R'}. \]

Indeed, it is clear by inspection that $P_{mp-1}(z)$ thus defined satisfies the
conditions on interpolation to f(z), since this equation is valid for $z = a_j$. Moreover,
if f(z) is expressed by Cauchy’s integral (which may be taken over the
whole of $C_{R'}$ even if $C_{R'}$ consists of several curves):

\[ f(z) = \frac{1}{2\pi i}\int_{C_{R'}} \frac{f(t)}{t-z}\,dt, \qquad z \text{ interior to } C_{R'}, \]

substitution in the previous equation leads to an integrand which has no
singularity in z and which is a polynomial in z of degree mp−1, so the function
$P_{mp-1}(z)$ is seen to be a polynomial of degree mp−1.
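
The simplest instance of this interpolation is m = 1, a_1 = 0, where the conditions say that the polynomial of degree p−1 agrees with f and its first p−1 derivatives at the origin. The sketch below works this case under illustrative assumptions not made in the paper: M is the closed unit disk and f(z) = 1/(2 − z), which is analytic for |z| < 2; the function names are hypothetical.

```python
import cmath
import math

def f(z):
    """Sample function, analytic for |z| < 2 (an assumption for this sketch)."""
    return 1.0 / (2.0 - z)

def interpolant(z, p):
    """Polynomial of degree p-1 matching f and its first p-1 derivatives at 0."""
    return sum(z**k / 2.0**(k + 1) for k in range(p))

def max_error_on_M(p, samples=360):
    """Largest sampled |f - P| on |z| = 1, the boundary of M."""
    worst = 0.0
    for k in range(samples):
        z = cmath.exp(2j * math.pi * k / samples)
        worst = max(worst, abs(f(z) - interpolant(z, p)))
    return worst

for p in (5, 10, 20):
    # In this example the error is exactly (z/2)**p / (2 - z),
    # whose maximum on M is 2**(-p), attained at z = 1.
    print(p, max_error_on_M(p), 2.0**-p)
```

The error decreases like a geometric series with ratio 1/2, which is the kind of bound N/R^n that the contour integral is used to establish in general.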
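For z on M, we have

\[ |f(z) - P_{mp-1}(z)| \le \frac{1}{2\pi}\int_{C_{R'}} \frac{|f(t)|\;|(z-a_1)\cdots(z-a_m)|^p}{|t-z|\;|(t-a_1)\cdots(t-a_m)|^p}\,|dt|. \]

Since f(z) is analytic on and within $C_{R'}$, there follow the inequalities $|f(t)| < N''$;
$|(z-a_1)\cdots(z-a_m)|/|(t-a_1)\cdots(t-a_m)| \le 1/R^m$, and $1/|t-z| < 1/\delta$,
z on M, t on $C_{R'}$. Set

\[ \int_{C_{R'}} |dt| = L; \]

we have

\[ |f(z) - P_{mp-1}(z)| \le \frac{N'}{R^{mp}}, \qquad z \text{ on } M, \]

where $N' = N''L/(2\pi\delta)$ is independent of p and z.
The polynomial $P_n(z)$ of degree n, n = 1, 2, ..., already defined when n
is of the form mp−1, is now defined for arbitrary n by the equation

\[ P_n(z) = P_{mp-1}(z), \qquad mp - 1 \le n < m(p+1) - 1. \]

Then we have the inequality

\[ |f(z) - P_n(z)| \le \frac{N'}{R^{mp}} \le \frac{N}{R^n}, \qquad z \text{ on } M, \]

where $N = N'R^{m-1}$, and where N is independent of n and z. The proof of the
first part of the theorem is complete.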

The proof of the second part of the theorem is the analog of the cor- 
responding proof given by Walsh in the case of a closed limited point set 
whose complement is simply connected,* and is a direct application of Lem- 
ma VI. 

The following theorem is simpler but less explicit than Theorem I: 


THEOREM. Let M be an arbitrary closed limited point set whose complement
K is connected and possesses a Green’s function with pole at infinity. A necessary
and sufficient condition that f(z) be analytic on M is that there exist polynomials
$P_n(z)$ of respective degrees n such that the inequality

\[ \text{(a)}\qquad |f(z) - P_n(z)| \le \frac{N}{R^n}, \qquad R > 1, \]

N not dependent on n or z, is valid for every z on M.


The function f(z) of Theorem I is not necessarily a monogenic analytic 
function; in other words, if we consider the functions defined on various 
separated pieces of M, the hypotheses of the theorem may well be satisfied 
where f(z) is not a monogenic analytic function. 

Theorem II shows the best degree of approximation (measured like the
convergence of a geometric series) possible for a sequence of polynomials
$\{P_n(z)\}$:

THEOREM II. Let M, K, w(z) be defined as in Theorem I, and let f(z) be
analytic on M. Let R be the largest number for which the following is true: (1)
a function F(z) is analytic and single-valued interior to $C_R$, (2) F(z) = f(z) on M.
Then there exists a sequence of polynomials $\{P_n(z)\}$ of respective degrees n,
such that

\[ |f(z) - P_n(z)| \le \frac{N}{R_0^{\,n}}, \qquad z \text{ on } M,\ R_0 \text{ arbitrary} < R, \]

N dependent on $R_0$ but not on n or z; but for no sequence of polynomials $\{P_n(z)\}$
do we have

\[ |f(z) - P_n(z)| \le \frac{N}{R_1^{\,n}}, \qquad z \text{ on } M,\ R_1 > R, \]

N dependent on $R_1$ but not on n or z.
* Münchner Berichte, loc. cit., p. 226.



The number R defined by (1) and (2), finite or infinite, exists; the formal 
proof is left to the reader. 

The curve $C_R$ is characterized by the fact that the function f(z) (when
suitably extended analytically from M along paths interior to $C_R$) is analytic
and single-valued interior to $C_R$, but is not analytic or is not single-valued or
fails in both particulars interior to every $C_{R'}$, R’>R, when extended from M
along paths interior to $C_{R'}$. Thus, (a) at some point P of $C_R$ the function
f(z) has a singularity for analytic extensions from M along paths interior to
$C_R$ terminating in P; or (b) the curve $C_R$ has at least one multiple point Q,
and there is disagreement at Q among the various analytic extensions of f(z)
from the various parts of M to Q along paths belonging to the several regions
interior to and bounded by $C_R$; or (c) both (a) and (b) occur.

As an illustration of (b) let the point set M be the closed interior of the
lemniscate $|z^2 - 1| = c$, c<1, and let f(z) = 1 in the oval to the right of the
origin, and f(z) = −1 in the oval to the left of the origin. Then $C_R$ is the
lemniscate $|z^2 - 1| = 1$. As an illustration of (c) let $f(z) = 1/(z - 2^{1/2})$ and
$1/(z + 2^{1/2})$ in the right and left ovals of the point set M above. Then $C_R$ is
again the lemniscate with double point $|z^2 - 1| = 1$.
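
The value of R in illustration (b) can be worked out from the Green's function of §3. The following short computation is a sketch added for orientation, using only the formula $G = \log\,[|P(z)|/c]^{1/m}$ with $P(z) = z^2 - 1$, m = 2.

```latex
\begin{align*}
G(x, y) &= \tfrac{1}{2}\log\frac{|z^2 - 1|}{c}, \qquad |w(z)| = e^{G} = \Big(\frac{|z^2-1|}{c}\Big)^{1/2}, \\
C_\rho &:\ |z^2 - 1| = c\,\rho^2, \qquad \rho > 1 .
\end{align*}
```

For $c\rho^2 < 1$ the curve $C_\rho$ consists of two ovals, and the function equal to 1 in one and −1 in the other is analytic and single-valued interior to $C_\rho$. For $c\rho^2 > 1$ the interior of $C_\rho$ is a single region containing both ovals of M, and no analytic function can equal 1 on one oval and −1 on the other. Hence $cR^2 = 1$, that is $R = c^{-1/2}$, and $C_R$ is the lemniscate $|z^2 - 1| = 1$ with double point at z = 0.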

The first statement of Theorem II has been proved in Theorem I, although
the polynomials there exhibited depend on $R_0$; this restriction does
not appear for the polynomials of Theorem III below. We shall prove the
second statement.

Assume that polynomials $P_n(z)$ of degree n, n = 1, 2, ..., exist such that

\[ |f(z) - P_n(z)| \le \frac{N}{R_1^{\,n}}, \qquad z \text{ on } M,\ R_1 > R, \]

N independent of n and z. By Theorem I, the sequence $\{P_n(z)\}$ converges
to an analytic function F(z) within $C_{R_1}$. Then F(z) is analytic interior to $C_{R_1}$
and F(z) = f(z) on M, where $R_1$ is greater than R, contrary to hypothesis.

6. The Tchebycheff polynomial. The Tchebycheff polynomial of degree
n for approximation to f(z) on M is the polynomial $\Pi_n(z)$ of degree n such that

\[ \max\,|f(z) - \Pi_n(z)|, \qquad z \text{ on } M, \]


is not greater than the corresponding expression for any other polynomial of 
degree n. The Tchebycheff polynomial exists and is unique,* under the hy- 
potheses of Theorem I. 

Theorem III states the exact region of uniform convergence of sequences 
of polynomials of best approximation in the sense of Tchebycheff. The first 


* Tonelli, Annali di Matematica, vol. 15 (1908), pp. 47-119. 




part of this theorem was proved by Faber* for a point set M bounded by an
analytic Jordan curve and the entire theorem was proved by Walsh† in the
case of a closed limited point set whose complement is simply connected.


THEOREM III. Under the hypotheses of Theorem II, the sequence of polynomials
$\{\Pi_n(z)\}$ of respective degrees n, n = 1, 2, ..., of best approximation in
the sense of Tchebycheff to f(z) on M converges interior to $C_R$, uniformly on any
closed point set interior to $C_R$, and converges uniformly in no region containing
a point of $C_R$ in its interior.


The proof of this theorem is the analog of that given by Walsh in the case 
of a closed limited point set whose complement is simply connected, and is 
omitted. 

The proof of Theorem III holds for the following theorem: 


Any other sequence of polynomials which converges on M like the Tchebycheff
polynomials, or, in other words, such that the inequality $|f(z) - P_n(z)| \le N/R_0^{\,n}$
is satisfied for z on M and for $R_0$ arbitrary less than R, where N depends on $R_0$
but not on n or z, converges as in Theorem III.


The following theorem was proved by Walsh‡ in the special case of a
point set whose complement is simply connected:


Under the hypotheses of Theorem III, neither the sequence of polynomials
$\{\Pi_n(z)\}$ of best approximation to f(z) on M nor any other sequence of polynomials
which converges like the sequence of polynomials of best approximation converges
like a geometric series in any region or on any Jordan arc exterior to $C_R$.


The proof follows the method of proof of the last part of Theorem III.

In particular, the discussion holds for simultaneous approximation to real 
analytic functions on a finite number of intervals on the axis of reals. 

7. Other measures of approximation. There are other measures of approx- 
imation such as (1) approximation by the Tchebycheff method with a 
norm function, (2) approximation on M as measured in the sense of least 
pth powers (p>0) by line integrals in the case that M (closed, limited) is 
bounded by a finite number of rectifiable Jordan curves, (3) approximation 
on M as measured in the sense of least pth powers (p >0) by surface integrals 
where M (closed, limited, consisting of a finite number of regions) is an open 
set plus its boundary points. 

In each of these cases the polynomial of best approximation exists, and is 


* Crelle’s Journal, vol. 150 (1920), pp. 79-106.
† (1), p. 795; (2), pp. 381-384.
‡ (2), p. 385.



unique if p>1.* In each case, as we shall proceed to indicate, under suitable
restrictions on M, the sequence of polynomials of best approximation to f(z)
on M converges satisfying the inequality $|f(z) - P_n(z)| \le N/R_0^{\,n}$, $R_0 < R$, z
on M, and hence converges interior to $C_R$, uniformly on any closed set interior
to $C_R$, but converges uniformly in no region containing a point of $C_R$
in its interior.

The proofs in each of these cases are analogous to proofs already given
by Walsh.† In cases (2) and (3), an inequality of form $|f(z) - P_n(z)| \le N/R_0^{\,n}$,
$R_0 < R$, is first proved not for z on M but for z on a suitably chosen closed set
M’ interior to M. The conclusion follows from the fact that when M’ approaches
M, then $C_R'$ (defined for M’ as $C_R$ is defined for M) approaches $C_R$;
this latter fact is a consequence of the fundamental results of Lebesgue‡ on
harmonic functions and variable domains.

A Tchebycheff polynomial for approximation to f(z) on M with the norm
function p(z), where p(z) is given continuous and different from zero on M,
is the unique polynomial $\Pi_n(z)$ of degree n such that

\[ \max\,[\,|p(z)|\;|f(z) - \Pi_n(z)|\,], \qquad z \text{ on } M, \]


is not greater than the corresponding expression for any other polynomial of 
degree n. 


THEOREM IV. Under the hypotheses of Theorem II, the sequence of Tchebycheff
polynomials $\{\Pi_n(z)\}$ for approximation to f(z) on M with an arbitrary
positive continuous norm function p(z) converges interior to $C_R$, uniformly on an
arbitrary closed point set interior to $C_R$, and converges uniformly in no region
containing a point of $C_R$ in its interior.

A polynomial of best approximation in the sense of least weighted pth
powers as measured on $\sum \Gamma_j$, where $\Gamma_j$, j = 1, 2, ..., k, are k rectifiable Jordan
curves bounding the point set M (satisfying the hypotheses of Theorem I),
is a polynomial $\Pi_n(z)$ of degree n such that

\[ \sum_{j=1}^{k}\int_{\Gamma_j} |f(z) - \Pi_n(z)|^p\,n(z)\,|dz|, \]

where p>0 and n(z) is arbitrary, continuous, positive, is not greater than the
corresponding expression formed for any other polynomial of degree n.


THEOREM V. Let the closed limited point set M whose complement is con- 
nected be bounded by a finite number k of non-intersecting rectifiable Jordan 


* See for instance Walsh, these Transactions, vol. 33 (1931), pp. 668-689; p. 681.
† (1); (2).
‡ Palermo Rendiconti, vol. 24 (1907), pp. 371-402.



curves $\Gamma_j$. Under the hypotheses of Theorem II, the sequence of polynomials
$\{\Pi_n(z)\}$ of best approximation to f(z) on M in the sense of least weighted pth
powers (p>0) as measured on $\sum \Gamma_j$ converges throughout the interior of $C_R$,
uniformly on any closed point set interior to $C_R$, and converges uniformly in no
region containing a point of $C_R$ in its interior.

The case p=2 is of especial interest. Here the polynomial $\Pi_n(z)$ of best
approximation to an arbitrary function f(z) is of the form

\[ \Pi_n(z) = a_0P_0(z) + a_1P_1(z) + \cdots + a_nP_n(z), \]

where the $P_i(z)$, i = 0, 1, 2, ..., n, depend on the $\Gamma_j$ but not on f(z), and the
coefficients $a_i$ ($i \le n$) are independent of n. The set of polynomials $P_i(z)$ is said to
belong to the point set M.
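
The independence of the coefficients from n can be seen in the simplest case. The sketch below assumes, purely for illustration, that M is the closed unit disk with n(z) = 1 and the measure |dz|/(2π) on the circle, so that the polynomials belonging to M may be taken as P_i(z) = z^i, and that f(z) = 1/(2 − z); the integrals are approximated by equally spaced samples. None of these choices or names come from the paper.

```python
import cmath
import math

def f(z):
    """Sample function, analytic on the closed unit disk (an assumption)."""
    return 1.0 / (2.0 - z)

def coefficient(i, samples=256):
    """a_i = (1/2pi) * integral of f(z) * conj(z**i) |dz| over |z| = 1."""
    total = 0j
    for k in range(samples):
        z = cmath.exp(2j * math.pi * k / samples)
        total += f(z) * (z**i).conjugate()
    return total / samples

def best_l2_polynomial(n):
    """Coefficients a_0, ..., a_n of the least-squares polynomial of degree n."""
    return [coefficient(i) for i in range(n + 1)]

low = best_l2_polynomial(3)
high = best_l2_polynomial(5)
# Raising the degree only appends new coefficients; the old ones are unchanged.
print(low == high[:4])
print(abs(coefficient(3)), 1.0 / 16.0)
```

In this example a_i = 2^{-(i+1)}, the Taylor coefficient of f, so the polynomials of best approximation are the partial sums of the power series.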

The method of approximation used in Theorem V for p=2, n(z)=1, was
discussed and the corresponding special case of Theorem V was proved (under
an additional restriction) by Szegő* and Smirnoff† for the case of a point set
whose complement is simply connected.

A polynomial of best approximation in the sense of least weighted pth
powers as measured by integration over the areas $R_j$, where $R_j$, j = 1, 2, ...,
k, are arbitrary closed regions, is a polynomial $\Pi_n(z)$ of degree n, n = 1, 2, ...,
such that

\[ \sum_{j=1}^{k}\iint_{R_j} |f(z) - \Pi_n(z)|^p\,n(z)\,dS, \]

where p>0, n(z) is continuous and positive in $R_j$, is not greater than the
corresponding expression‡ formed for any other polynomial of degree n.


THEOREM VI. Let $R_j$, j = 1, 2, ..., k, be arbitrary closed limited regions
no two of which have a common point. Let K denote the region consisting of all
points which can be connected with the point at infinity by Jordan arcs which do
not contain points of the $R_j$. Let G(x, y) be Green’s function with pole at infinity
for K. Under the hypotheses of Theorem II, the sequence $\{\Pi_n(z)\}$ of polynomials
of best approximation to f(z) in the sense of least weighted pth powers, p>0, over
the areas $R_j$, j = 1, 2, ..., k, converges interior to $C_R$, uniformly on any closed
point set interior to $C_R$, and converges uniformly in no region containing a point
of $C_R$ in its interior.


It will be noticed that the regions R; are not necessarily Jordan regions, 
and in fact any region R; may be multiply connected and even if simply con- 


* Mathematische Zeitschrift, vol. 9 (1921), pp. 218-270.

† Journal de la Société Physico-Mathématique de Léningrade, vol. 2 (1928), pp. 155-178.

‡ If any of the boundaries of $R_j$, j = 1, 2, ..., k, have area, either upper or lower integral may
be used here.



nected may separate various regions B from K. The hypothesis of Theorem 
VI includes the analyticity of f(z) in all such regions B. 

The case p=2 is again of especial interest. The polynomial $\Pi_n(z)$ of best
approximation to an arbitrary function f(z) is of the form

\[ \Pi_n(z) = a_0P_0(z) + a_1P_1(z) + \cdots + a_nP_n(z), \]

where the $P_i(z)$, i = 0, 1, 2, ..., n, depend on the $R_j$ but not on f(z), and the
coefficients $a_i$ ($i \le n$) are independent of n. The set of polynomials $P_i(z)$ is said to
belong to the point set M.

The method of approximation used in Theorem VI was considered by
Bergmann,* Bochner,† and Carleman‡ in the case of a single Jordan region,
p=2, n(z)=1, although without proof of our results on degree of convergence
and overconvergence.

As a complement to Theorems IV—VI we add 


THEOREM VII. If M consists of a finite number of mutually exclusive closed
Jordan regions and if the function f(z) is analytic in the interior points of M,
continuous in the corresponding closed regions, then (1) the sequence of polynomials
of best approximation to f(z) on M in the sense of Tchebycheff with a
positive continuous norm function converges to f(z) uniformly on M; (2) if the
Jordan curves bounding M are rectifiable, the sequence of polynomials of best
approximation to f(z) on M in the sense of least pth powers (p>0) as measured
by a line integral with a positive continuous norm function converges to f(z) at
every interior point of M, uniformly on any closed set interior to M; (3) the
sequence of polynomials of best approximation to f(z) on M in the sense of least
pth powers (p>0) as measured by a surface integral with a positive continuous
norm function converges to f(z) at every interior point of M, uniformly on any
closed set interior to M.


In case (3) it is indeed sufficient (see Carleman, loc. cit.) for this conclusion
if f(z) is analytic interior to M and if $\iint_M |f(z)|^2\,dS$ exists; the restrictions
in case (2) can similarly be somewhat lightened (see Smirnoff, loc. cit.).


HARVARD UNIVERSITY, 
CAMBRIDGE, Mass. 

WELLESLEY COLLEGE, 
WELLESLEY, Mass. 


* Mathematische Annalen, vol. 86 (1922), pp. 238-271.
† Mathematische Zeitschrift, vol. 14 (1922), pp. 180-207.
‡ Arkiv för Matematik, Astronomi, och Fysik, vol. 17 (1922-23).


THE FIRST AND SECOND VARIATIONS OF AN 
n-TUPLE INTEGRAL IN THE CASE OF 
VARIABLE LIMITS* 


BY 
H. A. SIMMONS 


1. Introduction. The main purpose of this paper is to generalize except in 
one detail all of the results which we obtained relative to a double integral 
in a previous article.† The first eight sections here would constitute a complete
generalization of those results if the integrand of the (n−1)-tuple integral
in equation (6.9) below could be expressed as a linear homogeneous
function of ¢ and not involve the ¢., (cf. equation (49) of the previous 
article). We have not been able to do this. A special device that was used in 
obtaining equation (49) of the previous article does not seem capable of gen- 
eralization here. In §9 we suggest for the fundamental formulas in (4.8), for 
the integrals J’(0) and J’’(0), certain applications that are not made in §§1-8. 

In view of certain papers of Lichtenstein‡ and Reid§, in which Jacobi’s
condition is stated in terms of characteristic numbers of boundary value
problems somewhat like the problem of §7 below, and also because of recent
advances in the theory of elliptic partial differential equations,|| our
generalization seems desirable.

The theses of Bates¶ and Powell** are useful in studying curvilinear
coordinate systems of the type that we employ here. We take the lines of
curvature as the parameter lines.

The legitimacy of the use that we make of an extended Green’s theorem
is well known.††

In this paper the variables $x_1, \dots, x_n$ are the coordinates of a euclidean
space X of n (n ≥ 2) dimensions in the euclidean space XZ of (n+1) dimensions
of coordinates x, z. An equation z=z(x) or =0, where


* Presented to the Society, December 1, 1933; received by the editors March 29, 1933. 

† These Transactions, April, 1926, p. 235.

‡ Monatshefte für Mathematik und Physik, vol. 28 (1917), p. 3; Mathematische Zeitschrift, vol. 5 (1919), p. 26.

§ American Journal of Mathematics, vol. 54 (1932), p. 791.

|| Cf. bibliography at the close of Raab's thesis, Jacobi's condition..., The University of Chicago Press.

¶ These Transactions, vol. 12 (1911), p. 19.

** The University of Chicago Press.

†† Cf., for example, Franklin, Annals of Mathematics, 1923, p. 213.




$x = (x_1, \dots, x_n)$, defines an $n$-dimensional hypersurface in $XZ$. We let $p_i = \partial z/\partial x_i$ ($i = 1, \dots, n$) and let $W$ stand for a $(2n+1)$-dimensional open region in the space $XZP$ of the variables $x$, $z$, and $p = (p_1, \dots, p_n)$. Then we define an admissible hypersurface $z = z(x)$ to be one with its elements in $W$ and having the following four properties: (i) $z$ is a single-valued function of the $x$'s; (ii) $z$ is of class $C''$; (iii) it has a real, simply closed $(n-1)$-dimensional intersection (that is, a connected $(n-1)$-dimensional intersection that is bounded, closed, and does not intersect itself) $L_0'$, with a fixed hypersurface $\phi(x, z) = 0$, which is of class $C''$ and has no singular point for $x$ and $z$ in $W$; (iv) it is such that the projection, $L_0$, of $L_0'$ on $X$ is met by a line parallel to any one of the coordinate axes, $x_i$, in a finite number of points and segments.

Property (iii) indicates the sense in which we use the term variable limits. The manifold $L_0'$ is the boundary of the portion of the admissible hypersurface $z = z(x)$ that we consider. On account of property (i), the correspondence between points of $L_0$ and $L_0'$ is one-to-one; and so $L_0$ is also a simply closed $(n-1)$-dimensional manifold (cf. (iii)). It bounds a simply connected portion of $X$ space. This we call $S_0$.

Property (iv) is required to insure that our application to $L_0$ and $S_0$ of an extended Green's theorem in §§5, 7 below shall be legitimate.

We consider here the $n$-tuple integral ($n \ge 2$)

(1.1)    $I = \int^{n}_{S_0} f(x, z, p)\,dx,$

where $x$, $z$, $p$, $S_0$ are as defined above, and $f$ is of class $C'''$ in $W$. This integral $I$ is our generalization of the double integral, taken over a region $A_0$, of the previous article.


Assuming that $z = z(x)$ is a minimizing admissible hypersurface for the integral $I$, we let $\zeta(x)$ be any function of the $x$'s with properties (i), (ii), and such that if $a$ is a real parameter sufficiently small numerically, then $z = z(x) + a\zeta(x)$ is admissible. Let $L_a'$ denote the $(n-1)$-dimensional manifold

$\phi(x, z(x) + a\zeta(x)) = 0, \quad z = z(x) + a\zeta(x)$

when $x$ is sufficiently near the $x$'s that determine points of $S_0$*; let $L_a$ denote the projection on $X$ space of $L_a'$; and designate by $S_a$ the hyperarea in the $X$ space that is bounded by $L_a$. Then in place of the integral (1.1), we have $I(a)$:


* So near that the correspondence between points $x$ and points $(u, v)$ in §2 is (1,1), reversible.




(1.2)    $I(a) = \int^{n}_{S_a} f(x, z + a\zeta, p + a\zeta_x)\,dx.$


Our main problem is to obtain the first and second derivatives $I'(0)$ and $I''(0)$, which are analogous to corresponding integrals of the previous article. We assume that $f \ne 0$ on the hypersurface $z = z(x)$ along its intersection $L_0'$ with the fixed hypersurface $\phi(x, z) = 0$ for a reason analogous to that which made a similar assumption desirable in the previous article.

In §2, we set up a normal curvilinear coordinate system which plays an important rôle in later sections of this paper; in §3, the equations of Rodriguez* are generalized and the result is used to obtain a simple expansion
of a functional determinant of §2; in §4, Theorem 1 of the previous article is 
generalized to the case of an n-tuple integral; in §5, the results relative to the 
first variation in the previous article are generalized; in §6, two expressions 
for I’’(0) are given; in §§7 and 8, the boundary value problem and the dis- 
cussion of the minimal surface, respectively, of the previous article are 
generalized; and the object of §9 is as we stated above. 

We wish to thank Professor L. P. Eisenhart for numerous suggestions that 
he has given relative to §§1, 2 of this paper. 

2. The normal coordinate system.† Let $L_0$ be a simply closed $(n-1)$-dimensional manifold with equations

$x_i = \xi_i(u) \quad (i = 1, \dots, n),$

where $u$ is $(n-1)$-partite and the $\xi$'s are defined for all real values of the $u$'s, are of class $C''$, have

$\sum_{i=1}^{n} \xi_{iu_j}^2 \ne 0 \quad (j = 1, \dots, n-1),$

where the subscript $u_j$ indicates partial differentiation of the $\xi_i$ with respect to $u_j$ (similar subscript notation is used throughout the sequel) and each $\xi_i$ has in $u_j$ a period, say $t_j$, which is passed through once (exactly) when a point $x$ passes once around the $u_j$-curve on $L_0$. We agree further, as stated above, to take the lines of curvature on $L_0$ as our parameter lines, so that the $u$-curves on $L_0$ are mutually orthogonal.‡

We now introduce near $L_0$ a $uv$-coordinate system determined by the equations

(2.1)    $x_i = \xi_i(u) + vA_i(u), \quad v_1 \le v \le v_2 \quad (i = 1, \dots, n),$


* Cf. Eisenhart's Differential Geometry, p. 122.
† Some of the ideas of this section are also expressed in §6 of Powell's thesis, which was referred to in §1.
‡ Cf., for example, Bates, loc. cit., p. 25, Theorem 1.




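where

(2.2)    $A_i = (-1)^{i+1}\,\dfrac{\partial(\xi_1, \dots, \xi_{i-1}, \xi_{i+1}, \dots, \xi_n)}{\partial(u_1, \dots, u_{n-1})} \quad (i = 1, \dots, n),$

$\sum_{i=1}^{n} A_i^2 = 1$, a condition that can be realized by a suitable choice of parameters; $v$ is one-partite; and $v_1 < 0$, $v_2 > 0$ are sufficiently small numerically that there exist unique functions

$u_j = U_j(x_1, \dots, x_n), \quad v = V(x_1, \dots, x_n)$

of class $C'$ satisfying equations (2.1). This is possible* since for $x$ on $L_0$, $(u, v) = (u, 0)$ and the functional determinant

(2.3)    $\Delta(v) = \begin{vmatrix} \xi_{1u_j} + vA_{1u_j} & \cdots & \xi_{nu_j} + vA_{nu_j} \\ A_1 & \cdots & A_n \end{vmatrix}$

(the first row standing for the $n-1$ rows $j = 1, \dots, n-1$) reduces for $v = 0$ to $(-1)^{n-1}\sum_{i=1}^{n} A_i^2 = (-1)^{n-1} \ne 0$.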


We call our coordinate system normal because the $(n-1)$ lines of curvature $u_j$ are mutually orthogonal and, at every point $P$ of $L_0$, $v$ measures along the unique normal in the $X$ space to $L_0$ through $P$ for all values of $n \ge 2$.

After obtaining, in §3, a simple expansion of the determinant $\Delta(v)$, we shall employ, in §4, the coordinate system of this section in differentiating an $n$-tuple integral with respect to a parameter.

3. Use of generalized equations of Rodriguez† in expanding $\Delta(v)$. The generalization of the equations of Rodriguez may be obtained from the first set of equations that Bates displays on page 24 of his article referred to above. In these equations we take the number of his parameters to be $(n-1)$, the number of $u$'s of the present paper, and we take his $x_j$ and his derivatives of the direction cosines of the normal as our $u_j$, $A_{iu_j}$, respectively. Since we are taking the lines $u_j$ to be lines of curvature (mutually orthogonal), we thus obtain the following generalized equations of Rodriguez:

(3.1)    $A_{iu_j} = \xi_{iu_j}/\rho_j \quad (i = 1, \dots, n),$

where $\rho_j$ is, except possibly for sign, the radius of curvature of the $u_j$-line of


* Cf. G. A. Bliss, Princeton Colloquium Lectures, p. 20.
† Cf. Eisenhart's Differential Geometry, p. 122.




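curvature. As a consequence of (3.1), the determinant $\Delta(v)$ in (2.3) may be written

$\Delta(v) = \begin{vmatrix} \xi_{1u_1}(1 + v/\rho_1) & \cdots & \xi_{nu_1}(1 + v/\rho_1) \\ \vdots & & \vdots \\ \xi_{1u_{n-1}}(1 + v/\rho_{n-1}) & \cdots & \xi_{nu_{n-1}}(1 + v/\rho_{n-1}) \\ A_1 & \cdots & A_n \end{vmatrix} = (1 + v/\rho_1) \cdots (1 + v/\rho_{n-1})\,\Delta(0)$

or

(3.2)    $\Delta(v) = \pm(1 + \pi_1 v + \pi_2 v^2 + \cdots + \pi_{n-1} v^{n-1}),$

where $\pi_j$ is the elementary symmetric function of $j$th order of the $1/\rho_i$* $(j = 1, \dots, n-1)$.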
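As a hedged illustration of (3.2), added here and not part of the original text, the case $n = 3$ (two lines of curvature on $L_0$) can be written out. It is assumed that the $\pi_j$ are the elementary symmetric functions of the reciprocals $1/\rho_1$, $1/\rho_2$, which is what the product form of the determinant requires, and that the $+$ sign is taken.

```latex
% Sketch for n = 3, taking the + sign in (3.2).
\begin{align*}
\Delta(v)^{+} &= \Bigl(1 + \frac{v}{\rho_1}\Bigr)\Bigl(1 + \frac{v}{\rho_2}\Bigr)
  = 1 + \pi_1 v + \pi_2 v^2, \\
\pi_1 &= \frac{1}{\rho_1} + \frac{1}{\rho_2}, \qquad
\pi_2 = \frac{1}{\rho_1 \rho_2}.
\end{align*}
```

In particular the expansion has value $1$ and $v$-derivative $\pi_1$ at $v = 0$, which are the only two facts about the determinant that enter the formulas of §4 when $a = 0$.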

4. The derivatives of an $n$-tuple integral with respect to a parameter. Consider a family of $(n-1)$-dimensional manifolds, one of which, $L_a$ (cf. the equation of $L_0$ in §2), is given by the equations

(4.1)    $x_i = \xi_i + v(u, a)A_i \quad (i = 1, \dots, n),$

where $v(u, a)$ is defined and of class $C''$ for all $(u, a)$ having each $u_j$ as it was defined in §2 and $a$ sufficiently near zero ($v_1 \le v(u, a) \le v_2$), where $v(u, a)$ has for each $u_j$ a period $T_j(a)$ that reduces to $T_j(0) = t_j$ for $a = 0$; and $v(u, 0) = 0$. All of the manifolds $L_a$ are closed on account of this periodicity and each $L_a$ is also simply closed for all values of $a$ sufficiently near zero since $L_0$ is simply closed. We let $S_a$ denote the hyperarea bounded by $L_a$ with the special understanding that when $a$ satisfies the equation $v(u, a) = v_1$, we are to designate $S_a$ and $L_a$ as $S_1$ and $L_1$, respectively.

Let $g(x, a)$ be a function of $a$ and the $x_i$ $(i = 1, \dots, n)$ which is of class $C''$ for all sets $(x, a)$ having $x$ in a sufficiently small neighborhood of the hyperarea $S_0$ bounded by $L_0$ and having $a$ such that $v_1 \le v(u, a) \le v_2$. Define $J(a)$ by the formula

(4.2)    $J(a) = \int^{n}_{S_a} g(x, a)\,dx.$

We desire the derivatives J’(0) and J’’(0). To obtain them, we first express 
the integral (4.2) as a sum of two integrals: 


* The relative simplicity of this expansion may be observed by comparing it with the one that results if $\Delta(v)$, in (2.3), is expanded by minors in the notation that Bates used in a similar expansion (cf. Bates, loc. cit., equation (34)).




(4.3)    $J(a) = \int^{n}_{S_1} g(x, a)\,dx + \int^{n}_{\Delta S} g(x, a)\,dx,$

where $S_1$ is as it was defined above and $\Delta S$ is the hyperarea in the $X$ space that is bounded by the $(n-1)$-dimensional manifolds $L_1$ and $L_a$. The derivative of the first integral in (4.3) has the value

(4.4)    $\int^{n}_{S_1} g_a(x, a)\,dx.$


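To differentiate the last integral in (4.3), we first transform it to the $uv$-coordinate system by means of (2.1). Letting $\Delta(v)^{+}$ stand for the value of $\Delta(v)$ when the $+$ sign is used before the parenthesis in (3.2), we find

(4.5)    $\int^{n}_{\Delta S} g(x, a)\,dx = \int^{n-1}_{L_0} \Bigl[\int_{v_1}^{v(u, a)} g(\xi + vA, a)\,\Delta(v)^{+}\,dv\Bigr] du,$

where $\xi + vA$ stands for the $n$ expressions $\xi_1 + vA_1, \dots, \xi_n + vA_n$. Since $a$ occurs only in the upper limit of the inner integral of (4.5) and explicitly in $g$, we find the derivative in question to be

$\int^{n-1}_{L_0} v_a\, g\,\Delta(v)^{+}\,du + \int^{n-1}_{L_0} \Bigl[\int_{v_1}^{v(u, a)} g_a\,\Delta(v)^{+}\,dv\Bigr] du,$

where in the first integral $v = v(u, a)$. Adding this result to the expression (4.4), we obtain

$J'(a) = \int^{n}_{S_1} g_a\,dx + \int^{n-1}_{L_0} \Bigl[\int_{v_1}^{v(u, a)} g_a\,\Delta(v)^{+}\,dv\Bigr] du + \int^{n-1}_{L_0} v_a\, g(\xi + vA, a)\,\Delta(v)^{+}\,du.$

Hence, after transforming the second integral to $x$-coordinates, we find

(4.6)    $J'(a) = \int^{n}_{S_a} g_a\,dx + \int^{n-1}_{L_0} v_a\, g(\xi + vA, a)\,\Delta(v)^{+}\,du.$

From the above procedure (perhaps with reference to the previous article), we now find without difficulty that

(4.7)    $J''(a) = \int^{n}_{S_a} g_{aa}\,dx + \int^{n-1}_{L_0} \bigl[(g v_{aa} + 2 g_a v_a + g_v v_a^2)\,\Delta(v)^{+} + g v_a^2\,\Delta_v(v)^{+}\bigr]\,du.$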


Putting a=0 in (4.6) and (4.7) and recalling that v(u, 0)=0, we find the de- 
sired results, which we express as follows. 




THEOREM 4.1. The derivatives $J'(0)$ and $J''(0)$ of the $n$-tuple integral $J(a)$ defined by (4.2), taken over the $n$-dimensional region $S_a$ bounded by the manifold $L_a$, defined by equations (4.1), have the values

(4.8)    $J'(0) = \int^{n}_{S_0} g_a\,dx + \int^{n-1}_{L_0} g v_a\,du,$

         $J''(0) = \int^{n}_{S_0} g_{aa}\,dx + \int^{n-1}_{L_0} (g v_{aa} + \pi_1 g v_a^2 + 2 g_a v_a + g_v v_a^2)\,du.$*

The derivatives (4.8) have been computed for the family of variations (4.1) of $L_0$. However we can obtain from (4.8) analogous formulas for a more general family of variations of $L_0$ of the form

(4.9)    $x_i = X_i(\tau, a) \quad (i = 1, \dots, n),$

where $\tau$ is $(n-1)$-partite and $\tau_j = \tau_j(u, a)$ $(j = 1, \dots, n-1)$, with $\tau_j(u_1, \dots, u_{n-1}, 0) = u_j$. We suppose that (4.9) represents a one-parameter family of simply closed $(n-1)$-dimensional manifolds containing $L_0$ for $a = 0$. The functions $X_i$ are supposed to be of class $C''$ for all values of $(\tau, a)$ having each $\tau_j$ real and $a$ sufficiently near zero. They have a period $\Gamma_j(a)$ for every $a$ that we consider, with $\Gamma_j(0) = t_j$. Such a family is representable in the form (4.1) if we can solve the equations

(4.10)    $X_i(\tau, a) - \xi_i(u) - vA_i(u) = 0$

for $v$ and the $\tau_j$ as functions of $a$ and the $u_j$. According to the implicit function theorem used in §2, this can be done since the equations (4.10) have the particular solution $(v, \tau, u, a) = (0, u, u, 0)$ for $0 \le u_j \le t_j$, on which the functional determinant


= —A(0) = 


Hence we can obtain $v_a$ and $v_{aa}$ for the general family (4.9). Differentiating (4.10) once, twice, and agreeing that a term in which $j$ appears as a repeated index (even though it be a subscript of a subscript) is to be summed for all integral values of $j$ from 1 to $(n-1)$, and setting $a = 0$, we obtain (4.11), (4.12), respectively,

(4.11)    $X_{i\tau_j}\tau_{ja} - A_i v_a + X_{ia} = 0,$
(4.12)    $X_{i\tau_j}\tau_{jaa} - A_i v_{aa} + X_{iaa} + 2X_{ia\tau_j}\tau_{ja} + X_{i\tau_j\tau_k}\tau_{ja}\tau_{ka} = 0.$


* In the second of equations (9) of the previous article there is a misprint. The last term of the integrand of the line integral of $J''(0)$ there should be $g_v v_a^2$.




The determinant of the $n$ equations (4.11) in the $\tau_{ja}$ and $v_a$, like that of the $n$ equations (4.12) in the $\tau_{jaa}$ and $v_{aa}$, is $-\Delta(0) = \mp 1$. After a sense is assigned to $L_0$, so as to give $\Delta(0)$ a definite value (cf. (2.2)), say $+1$, equations (4.11) define $v_a$ as a polynomial in the $X_{ia}$ and the $X_{i\tau_j}$ ($= \xi_{iu_j}$ on $L_0$), while (4.12) define $v_{aa}$ similarly in the $X_{iaa}$, $X_{ia\tau_j}$, $X_{i\tau_j\tau_k}$, $X_{i\tau_j}$, and $\tau_{ja}$. But the $\tau_{ja}$ in $v_{aa}$ can be eliminated by means of (4.11); the $\tau_{ja}$ are then polynomials in the $X_{ia}$, $X_{i\tau_j}$ since the $A_i$ are polynomials in the $X_{i\tau_j}$ (cf. the functional determinant last displayed above). Hence we have the following corollary of Theorem 4.1.


COROLLARY. The derivatives $J'(0)$ and $J''(0)$ in equations (4.8) can be generalized to the case where (4.1) is replaced by (4.9). When this is done, $v_a$ is a polynomial in the $X_{ia}$ and the $X_{i\tau_j}$, while $v_{aa}$ is a polynomial in the $X_{ia}$, $X_{i\tau_j}$, $X_{iaa}$, $X_{ia\tau_j}$, $X_{i\tau_j\tau_k}$.
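The formulas of Theorem 4.1 can be checked on a simple case. The following sketch is an illustration added here, not part of the original argument. It takes $n = 2$, $L_0$ the unit circle, $v(u, a) = a$, and $g = x_1^2 + x_2^2$ (all of these choices are assumptions made for the example), and it reads the boundary integrand of the second formula of (4.8) as $g v_{aa} + \pi_1 g v_a^2 + 2 g_a v_a + g_v v_a^2$.

```latex
% Illustrative check of (4.8): n = 2, L_0 the unit circle, v(u,a) = a.
% xi = (cos u, sin u), A = (cos u, sin u), rho_1 = 1, so pi_1 = 1.
% g = x_1^2 + x_2^2 gives g(xi + vA) = (1+v)^2, hence g = 1, g_a = 0, g_v = 2 at v = 0.
\begin{align*}
J(a) &= \int_0^{2\pi}\!\!\int_0^{1+a} r^2 \, r\,dr\,du = \frac{\pi}{2}(1+a)^4,
  & J'(0) &= 2\pi, \quad J''(0) = 6\pi; \\
\int_{L_0} g\,v_a\,du &= \int_0^{2\pi} 1 \cdot 1\,du = 2\pi,
  & \int_{L_0} (\pi_1 g v_a^2 + g_v v_a^2)\,du &= \int_0^{2\pi} (1 + 2)\,du = 6\pi .
\end{align*}
```

Both derivatives obtained by direct differentiation agree with the boundary integrals, the $n$-tuple integrals of $g_a$ and $g_{aa}$ being zero because this $g$ does not depend on $a$.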

5. The first variation. In the sequel, any term that contains a repeated index other than $a$, $v$, $z$, however it may appear, is to be summed for all integral values of the index from 1 to $n$. Thus we write $f_{p_i}(\phi_{x_i} + p_i\phi_z)$ for the sum of the $n$ terms that one obtains from this expression by taking $i = 1, \dots, n$. Further, when we use Kronecker $\delta$'s with $\delta_i^k = 1$ or $0$ according as $k = i$ or $k \ne i$, respectively, we extend customary convention by admitting subscripts of subscripts as summation indices; thus we would write $\zeta_{x_i}\delta_i^k = \zeta_{x_k}$.

Proceeding now as we did in §3 of the previous article, we find without 
difficulty the following equations, of which we number only those that are to 
be referred to later: 


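$I'(0) = \int^{n}_{S_0} f_a\,dx + \int^{n-1}_{L_0} f v_a\,du \quad [f_a = f_z\zeta + f_{p_i}\zeta_{x_i}];$

$\phi(\xi + vA,\ z(\xi + vA) + a\zeta(\xi + vA)) = 0,$

$\xi + v(u, a)A$ being as in (4.5), so that $\phi$ contains the variables $u$, $v$, $a$;

(5.1)    $v_a = -\phi_a/\phi_v \quad (\phi_v \ne 0, \text{ cf. } (5.8));$

(5.2)    $\phi_a = \phi_z\zeta, \quad \phi_v = (\phi_{x_i} + p_i\phi_z)A_i;$

(5.3)    $v_a = -\dfrac{\phi_z\zeta}{(\phi_{x_i} + p_i\phi_z)A_i};$

(5.4)    $I'(0) = \int^{n}_{S_0} (f_z\zeta + f_{p_i}\zeta_{x_i})\,dx - \int^{n-1}_{L_0} \dfrac{f\phi_z\zeta}{(\phi_{x_i} + p_i\phi_z)A_i}\,du.$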
So 


Hence we have the following theorem:

THEOREM 5.1. The first derivative $I'(0)$ of the $n$-tuple integral (1.1), taken over the portion of the hypersurface $z = z(x) + a\zeta(x)$ bounded by its intersection with the hypersurface $\phi(x, z) = 0$, has the value given by (5.4).




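From the point of view of the calculus of variations it is desirable to perform an integration by parts on the terms $f_{p_i}\zeta_{x_i}$ of (5.4). Since

$f_{p_i}\zeta_{x_i} = \dfrac{\partial(\zeta f_{p_i})}{\partial x_i} - \zeta\,\dfrac{\partial f_{p_i}}{\partial x_i},$

we can replace the $n$-tuple integral in (5.4) by

$\int^{n}_{S_0} \Bigl[\zeta\Bigl(f_z - \dfrac{\partial f_{p_i}}{\partial x_i}\Bigr) + \dfrac{\partial(\zeta f_{p_i})}{\partial x_i}\Bigr]\,dx.$

Using this result in (5.4), we obtain

(5.5)    $I'(0) = \int^{n}_{S_0} \zeta\Bigl(f_z - \dfrac{\partial f_{p_i}}{\partial x_i}\Bigr)\,dx + \int^{n-1}_{L_0} \zeta\Bigl[f_{p_i}A_i{}^{*} - \dfrac{f\phi_z}{(\phi_{x_j} + p_j\phi_z)A_j}\Bigr]\,du.$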
Since, along $L_0$, $\phi(\xi_1(u), \dots, \xi_n(u), z(\xi_1, \dots, \xi_n)) \equiv 0$ in the $u_j$, we can replace (5.5) by an equivalent equation analogous to equation (32) of the previous article. Differentiating this identity, $\phi = 0$, with respect to each $u_j$, we obtain the $(n-1)$ equations

(5.6)    $(\phi_{x_i} + p_i\phi_z)\xi_{iu_j} = 0 \quad (j = 1, \dots, n-1).$

Hence with (5.6) and the last equation in (5.2), we have equations which determine the $\phi_{x_i} + p_i\phi_z$:

(5.7)    $\phi_{x_i} + p_i\phi_z = A_i\phi_v$ (cf. (2.2)),

where

(5.8)    $\phi_v^2 = \sum_{i=1}^{n} (\phi_{x_i} + p_i\phi_z)^2 \ne 0.$


That $\phi_v^2 \ne 0$ may be proved as follows. Suppose $\phi_v^2 = 0$, so that

$\phi_{x_i} + p_i\phi_z = 0 \quad (i = 1, \dots, n).$

Then the hypersurfaces $z = z(x)$ and $\phi(x, z) = 0$ are tangent to each other, which is impossible when $f \ne 0$ along $L_0'$, as we shall see just below Corollary 5.2. In view of (5.7), (5.5) is equivalent to

(5.9)    $I'(0) = \int^{n}_{S_0} \zeta\Bigl(f_z - \dfrac{\partial f_{p_i}}{\partial x_i}\Bigr)\,dx + \int^{n-1}_{L_0} \zeta\,\dfrac{f_{p_i}(\phi_{x_i} + p_i\phi_z) - f\phi_z}{\phi_v}\,du.$

* In obtaining this term we have used the extended Green's theorem referred to above with the $A_i$ as direction cosines of the outer normal to $L_0$.



The Euler necessary condition for a minimum value of $I$ in the case of fixed limits (where the hypersurface $\phi(x, z) = 0$ is replaced by a bounded, closed, connected $(n-1)$-dimensional manifold, such as $L_0$) is

(5.10)    $f_z - \dfrac{\partial f_{p_i}}{\partial x_i} = 0$

at every point of $S_0$. This is surely a necessary condition for the case of variable limits. Since $I'(0) = 0$ is a necessary condition for a minimum value of $I$, it now follows that if $z = z(x)$ minimizes $I$ the second integral in (5.9) vanishes, and indeed that the numerator, $N$, of the integrand of this integral is zero at every point of $L_0$, $0 \le u_j \le t_j$, as we presently prove. Suppose $N$ does not so vanish. Either $N$ has one sign on the entire manifold $L_0$ or there is a non-zero $(n-1)$-dimensional subregion of $L_0$ on which $N$ has one sign. We may take $\zeta(u_1, \dots, u_{n-1})$ to be of the same sign as $N$ on one such subregion and zero elsewhere. Then since $\phi_v \ne 0$, $I'(0) \ne 0$ (contradiction). Hence we have the transversality condition

(5.11)    $f\phi_z - (\phi_{x_i} + p_i\phi_z)f_{p_i} = 0$

at every point of $L_0$.
We now have the following corollaries of Theorem 5.1. 


COROLLARY 5.1. The first derivative of the $n$-tuple integral $I(a)$, of (1.2), taken over the portion of the hypersurface $z = z(x) + a\zeta(x)$ bounded by its intersection with the fixed hypersurface $\phi(x, z) = 0$, has the value given by (5.9).


COROLLARY 5.2. In case $z = z(x)$ is a minimizing hypersurface for the $n$-tuple integral (1.1), the Euler equation (5.10) must hold at every point of the portion of the hypersurface $z = z(x)$ inside $L_0'$, and the transversality condition (5.11) must hold at every point of the boundary, $L_0'$, which is the manifold of intersection of the hypersurfaces $z = z(x)$ and $\phi(x, z) = 0$.

Since we have assumed in §1 that $f \ne 0$ along $L_0'$, it follows from (5.11) that the hypersurface $z = z(x)$ is not tangent to the hypersurface $\phi(x, z) = 0$ at any point of $L_0'$. In the case of the minimal hypersurface for which $f = (1 + p_ip_i)^{1/2}$, (5.11) reduces to $\phi_z - p_i\phi_{x_i} = 0$, which shows that the hypersurfaces $z = z(x)$ and $\phi(x, z) = 0$ meet at right angles.
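The reduction in the minimal case can be written out. This is a sketch added for illustration, under the assumption that the transversality condition has the form $f\phi_z - f_{p_i}(\phi_{x_i} + p_i\phi_z) = 0$ with summation on $i$; the sign convention and the normal vectors used in the last line are assumptions of the sketch, not quotations from the text.

```latex
% Sketch: minimal-hypersurface case of the transversality condition,
% assuming it reads  f\phi_z - f_{p_i}(\phi_{x_i} + p_i\phi_z) = 0  (sum on i).
\begin{align*}
f &= (1 + p_k p_k)^{1/2}, \qquad f_{p_i} = p_i / f, \\
f\phi_z - f_{p_i}(\phi_{x_i} + p_i\phi_z)
  &= \frac{(1 + p_k p_k)\phi_z - p_i\phi_{x_i} - p_i p_i\phi_z}{f}
   = \frac{\phi_z - p_i\phi_{x_i}}{f}, \\
(-p_1, \dots, -p_n, 1)\cdot(\phi_{x_1}, \dots, \phi_{x_n}, \phi_z)
  &= \phi_z - p_i\phi_{x_i} = 0 .
\end{align*}
```

The last line is the inner product of a normal vector of $z = z(x)$ with a normal vector of $\phi(x, z) = 0$, so its vanishing expresses that the two hypersurfaces are orthogonal along their intersection.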

6. The second variation. To get $I''(0)$, we apply to (1.2) the result in the second of equations (4.8). Replacing $g$ in that equation by $f$, we obtain

(6.1)    $I''(0) = \int^{n}_{S_0} f_{aa}\,dx + \int^{n-1}_{L_0} M\,du,$

* Cf. page 5 of Powell's thesis, loc. cit.

where $M = f v_{aa} + \pi_1 f v_a^2 + 2 f_a v_a + f_v v_a^2$ and

(6.2)    $f_{aa} = f_{zz}\zeta^2 + 2f_{zp_i}\zeta\zeta_{x_i} + f_{p_ip_j}\zeta_{x_i}\zeta_{x_j}.$

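By differentiating the equation $\phi_v v_a + \phi_a = 0$ (cf. (5.1)), with respect to $a$, we find, as in the previous article, that

(6.3)    $v_{aa} = -\dfrac{1}{\phi_v}(\phi_{vv}v_a^2 + 2\phi_{av}v_a + \phi_{aa}),$

where $\phi_v$ is defined in (5.2), and $\phi_{vv}$, $\phi_{av}$, $\phi_{aa}$, at $a = 0$, are obtained by differentiating $\phi$ as a function of the arguments

$\xi_i + vA_i \quad (i = 1, \dots, n), \qquad z(\xi + vA) + a\zeta(\xi + vA),$

$\xi + vA$ standing for the set of $n$ expressions $\xi_i + vA_i$. We find, for $a = 0$,

$\phi_{vv} = (\phi_{x_ix_j} + 2\phi_{x_iz}p_j + \phi_{zz}p_ip_j + \phi_z z_{x_ix_j})A_iA_j,$

$\phi_{av} = (\phi_{x_iz}\zeta + \phi_{zz}p_i\zeta + \phi_z\zeta_{x_i})A_i, \quad \phi_{aa} = \phi_{zz}\zeta^2.$

Using these derivatives together with (5.2) and (5.3), we find (cf. (6.3))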


1 
2¢.£ + Pibes) ($2; + Pid:) 


Hence if we collect in (6.4) the terms involving the second derivatives of ¢, 


those involving the second derivatives of z, and those free of second deriva- 
tives, we find 


(6.4) 


Vea = ($2 ¢2;2; 2b 242 + iA; — 
(6.5) 


2; 
+ (22; + o2 


Now using the notation 


Gi = — b2;/b2, A= — (bi — 98) 
$2 26 2b + Pz; 

$3 

so that (0g;/x,) = (sij+5s,)/2, we find from (6.5) that 


= 


1 
(6.6) Vea = — rij) + — 




The introduction of the g; and the s;; (whose denominators involve ¢,*) does 
not require that ¢(x, z)=0 be representable in the form z=2;(x) (cf. the 
(m—1)-tuple integral in (6.9), in which A0 since 0#¢, =¢,A). 
The other three terms of M are 
( 1 
fod A? + 
(6.7) 
= — 94), 


2faVa (2¢/A) (f. + fod 


where 
(6.8) (f) = fat + 

Collecting the terms of M, as they are given by (6.6), (6.7), (6.8), and 
using the value (6.2) of faa, we obtain the following theorem. 


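THEOREM 6.1. The second derivative $I''(0)$ of the $n$-tuple integral $I(a)$ of equation (1.2), taken over the portion of the hypersurface $z = z(x) + a\zeta(x)$ bounded by its intersection with the hypersurface $\phi(x, z) = 0$, has the value

(6.9)    $I''(0) = \int^{n}_{S_0} 2\Omega\,dx + \int^{n-1}_{L_0} (B\zeta^2 + C_i\zeta\zeta_{x_i})\,du,$

where

$2\Omega = f_{zz}\zeta^2 + 2f_{zp_i}\zeta\zeta_{x_i} + f_{p_ip_j}\zeta_{x_i}\zeta_{x_j},$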
(6.10) BS f(sig — + fOr + — 98) — 
Ci = 2[f(b; — — 


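7. A boundary value problem associated with the second variation. We generalize the boundary value problem of the previous article. By Euler's theorem on homogeneous functions, the $n$-tuple integral of (6.9) can be written in the form

$\int^{n}_{S_0} (\zeta\Omega_\zeta + \zeta_{x_i}\Omega_i)\,dx; \quad \Omega_i = \Omega_{\zeta_{x_i}}.$

Then, after performing a customary integration by parts, we find

$\int^{n}_{S_0} 2\Omega\,dx = \int^{n}_{S_0} \Bigl[\zeta\Psi(\zeta) + \dfrac{\partial(\zeta\Omega_i)}{\partial x_i}\Bigr]\,dx, \quad \Psi(\zeta) = \Omega_\zeta - \dfrac{\partial\Omega_i}{\partial x_i}.$

Applying the extended Green's theorem heretofore used, we now find

(7.1)    $\int^{n}_{S_0} 2\Omega\,dx = \int^{n}_{S_0} \zeta\Psi(\zeta)\,dx + \int^{n-1}_{L_0} \zeta A_i\Omega_i\,du.$

From (6.9) and (7.1), we now obtain

$I''(0) = \int^{n}_{S_0} \zeta\Psi(\zeta)\,dx + \int^{n-1}_{L_0} (B\zeta^2 + C_i\zeta\zeta_{x_i} + \zeta A_i\Omega_i)\,du;$

or since

$\Omega_i = \zeta f_{zp_i} + \zeta_{x_j}f_{p_ip_j},$

we have

(7.2)    $I''(0) = \int^{n}_{S_0} \zeta\Psi(\zeta)\,dx + \int^{n-1}_{L_0} (D\zeta^2 + E_i\zeta\zeta_{x_i})\,du,$

where

(7.3)    $D = B + A_if_{zp_i}, \quad E_k = C_k + A_if_{p_ip_k}$ (cf. (6.10)).

From (7.2) we can now state a new necessary condition in order that the hypersurface $z = z(x)$ shall minimize the $n$-tuple integral (1.1).


THEOREM 7.1. In order that the hypersurface $z = z(x)$ shall minimize the $n$-tuple integral (1.1), it is necessary that for negative values of $\lambda$ the boundary value problem

$\Psi(\zeta) - \lambda\zeta = 0$ in the region $S_0$,
$D\zeta + E_i\zeta_{x_i} = 0$ on the boundary $L_0$ of $S_0$

have no solution except $\zeta \equiv 0$, $D$ and the $E_i$ being defined through (6.10) and (7.3).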
8. The minimal hypersurface. Here we define $f = (1 + p_ip_i)^{1/2}$. We shall compute the $I''(0)$ of (6.9) for the present case. Since $f_z = 0$, the only derivatives needed here are

$f_{p_i} = p_i/f, \quad f_{p_ip_j} = (f^2\delta_i^j - p_ip_j)/f^3, \quad f_a = p_i\zeta_{x_i}/f$ (cf. (6.8)),

$f_{aa} = 2\Omega = \zeta_{x_i}\zeta_{x_j}(f^2\delta_i^j - p_ip_j)/f^3.$
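The derivatives just listed follow from a short computation, sketched here as an added illustration. It assumes $f = (1 + p_kp_k)^{1/2}$ with summation on repeated indices and writes $\delta_i^j$ for the Kronecker delta; the final inequality is a remark of this sketch, not a statement of the paper.

```latex
% Sketch: second derivatives of f = (1 + p_k p_k)^{1/2}; \delta_i^j is the Kronecker delta.
\begin{align*}
f_{p_i} &= \frac{p_i}{f}, \\
f_{p_i p_j} &= \frac{\delta_i^j}{f} - \frac{p_i}{f^2}\,\frac{p_j}{f}
  = \frac{\delta_i^j f^2 - p_i p_j}{f^3}, \\
f_{p_i p_j}\,\zeta_{x_i}\zeta_{x_j}
  &= \frac{f^2\,\zeta_{x_i}\zeta_{x_i} - (p_i\zeta_{x_i})^2}{f^3}
  \;\ge\; \frac{\zeta_{x_i}\zeta_{x_i}}{f^3} \;\ge\; 0 .
\end{align*}
```

The inequality uses $(p_i\zeta_{x_i})^2 \le (p_ip_i)(\zeta_{x_j}\zeta_{x_j})$ and $f^2 = 1 + p_ip_i$, so the integrand of the $n$-tuple integral in the second variation is non-negative in this case.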
One now finds that the B, C; of (6.10) reduce to B’, C/ , respectively, where 

1 1 

B’ = f(siy — +4(— —)a + piri di — 


Pn—1 
Ci = — — 


Hence we have the following corollary of Theorem 6.1. 


(8.1) 




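COROLLARY 8.1. In the case of the minimal hypersurface, the $I''(0)$ of Theorem 6.1 reduces to

$I''(0) = \int^{n}_{S_0} \zeta_{x_i}\zeta_{x_j}(f^2\delta_i^j - p_ip_j)/f^3\,dx + \int^{n-1}_{L_0} (B'\zeta^2 + C_i'\zeta\zeta_{x_i})\,du,$

where $B'$ and the $C_i'$ are as defined in (8.1).

To make a similar specialization of Theorem 7.1, we observe that in the present case

(8.2)    $\Psi(\zeta) = -\dfrac{\partial}{\partial x_i}\bigl[\zeta_{x_j}(f^2\delta_i^j - p_ip_j)/f^3\bigr],$
         $D = B'$, of (8.1),
         $E_k = C_k' + A_i(f^2\delta_i^k - p_ip_k)/f^3$ (cf. (7.3) and (8.1)).

Consequently we have the following corollary to Theorem 7.1.

COROLLARY 8.2. In order that the hypersurface $z = z(x)$ shall minimize the $n$-tuple integral (1.1) in the case where $f = (1 + p_ip_i)^{1/2}$, it is necessary that for negative values of $\lambda$ the following boundary value problem have no solution except $\zeta \equiv 0$:

$\dfrac{\partial}{\partial x_i}\bigl[\zeta_{x_j}(f^2\delta_i^j - p_ip_j)/f^3\bigr] + \lambda\zeta = 0$ on $S_0$,

$B'\zeta + E_i\zeta_{x_i} = 0$ on $L_0$,

where $\zeta_{x_i} = \partial\zeta/\partial x_i$, the differential operator is that of $\Psi(\zeta)$ given by (8.2), $B'$ is defined by (8.1), and the $E_i$ are given in (8.2).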

9. Further applications of Theorem 4.1. Since $g$ in §4 is merely required to be a function of class $C''$ in the $x$'s and the parameter $a$, there may be numerous applications of Theorem 4.1, even to more complicated variation problems than the problem associated with §§1-8 above. We have made one such application. We have used the first equation of (4.8) to compute the first variation of the integral

$K = \int^{n}_{S_0} f(x, z, p, r)\,dx,$

where $S_0$, $n$, $x$, $z$, $p$ have the meaning relative to $K$ which they had for $I$ in §§1-8; $r$ is the set of all of the derivatives

$r_{ij} = \dfrac{\partial^2 z}{\partial x_i\,\partial x_j} \quad (i, j = 1, \dots, n);$

and $f$ is supposed to have suitable continuity in a region $W$ of the space $XZPR$ in which a minimum value of $K$ is desired. A fixed hypersurface $\phi(x, z) = 0$, with suitable continuity, is employed as in §§1-8.

We state without proof that the analogs of (5.4) and (5.9) here are (9.1) 
and (9.2) below, respectively: 


n—1 Séo.du 
(ba + 


(9.1) K’'(0) + 2; fog sh 252;)0% f 
L 


(9.2) K'(0) = as 


Ox,0%; 


| + + Pibs) — + pis) 
rf] = 
bat 
By methods that were used in preceding sections, one could compute 
K’’(0). 


NORTHWESTERN UNIVERSITY, 
EVANSTON, ILL. 




CONSECUTIVE COVARIANT CONFIGURATIONS 
AT A POINT OF A SPACE CURVE* 


BY 
ABBA V. NEWTON 


I. INTRODUCTION 


A systematic study of the projective differential geometry of space curves was first made by Halphen† in a memoir of 1880. Wilczynski‡ in 1905 and 1906 and Sannia§ in 1926 made important additions to the subject.

The projective differential theory of a curve involves many configurations associated covariantly with the curve. The purpose of this paper is to make some contributions to the theory of a space curve, which are based upon the study of consecutive configurations. The work follows the lines of a similar investigation made by Lane|| for the case of a plane curve.

We now make precise the meaning of the word "consecutive" as used in the present paper. Let us consider an analytic curve $C$ in projective space of three dimensions. The equations of such a curve, in non-homogeneous projective coordinates $x$, $y$, $z$, can be written in the form of two power series expansions,

(1)    $y = a_0 + a_1x + a_2x^2 + \cdots, \quad z = c_0 + c_1x + c_2x^2 + \cdots,$

which represent $C$ in the neighborhood of the ordinary point $P$ with coordinates $0$, $a_0$, $c_0$; the neighborhood is supposed to be sufficiently small so that the series converge. If $Q$ is a point with coordinates $h$, $k$, $l$ on $C$ near $P$, then $h$, $k$, $l$ must satisfy the foregoing equations when substituted in place of $x$, $y$, $z$, respectively:

$k = a_0 + a_1h + a_2h^2 + \cdots, \quad l = c_0 + c_1h + c_2h^2 + \cdots.$

If, now, $h$ is regarded as an infinitesimal, and if we are considering problems


* Presented to the Society, June 22, 1933; received by the editors June 24, 1933. 

† G. H. Halphen, Sur les invariants différentiels des courbes gauches, Journal de l'École Polytechnique, vol. 28 (1880), p. 1.

‡ E. J. Wilczynski, General projective theory of space curves, these Transactions, vol. 6 (1905), p. 99.
E. J. Wilczynski, Projective Differential Geometry of Curves and Ruled Surfaces, B. G. Teubner, 1906.

§ G. Sannia, Nuova trattazione della geometria proiettivo-differenziale delle curve sghembe (memoria 2), Annali di Matematica, (4), vol. 3 (1926), p. 1.

|| E. P. Lane, On the projective differential geometry of plane curves, Tohoku Mathematical Journal, vol. 37 (1933), p. 423.




in which the powers of $h$ higher than the first are negligible in comparison with the first power, we can drop all terms after the second in each series and write

$k = a_0 + a_1h, \quad l = c_0 + c_1h.$

In such a case the point $Q$ is said to be "consecutive" to the point $P$. Moreover, the tangent line of the curve at $Q$ is said to be "consecutive" to the tangent line at $P$, and similarly for other corresponding covariant configurations associated with the points $P$ and $Q$.

For the development of the theory of consecutive configurations we need 
first to call to mind some fundamental facts from the projective differential 
geometry of a space curve. This is done in §II. In this section are described 
certain configurations covariantly associated with a point P of a space curve. 
These determine geometrically the vertices and unit point of a local coordi- 
nate system at the point P. Equations (1), when referred to this local co- 
ordinate system, reduce to the canonical form with which the section opens. 

In §III we consider the configurations associated with a point Q consecu- 
tive to the point P as those of §II are associated with P. Their equations re- 
ferred to the local coordinate system at the point P are determined. From 
these can be found the coordinates, in the same coordinate system at P, of 
the vertices and unit point of the local coordinate system associated with the
point Q, that is, the consecutive local coordinate system. This makes it pos- 
sible to set up the equations of transformation between the two local coordi- 
nate systems as is also done in the section. Lastly, the equations of the curve 
C referred to the consecutive coordinate system are deduced. 

The application of this theory is made in §IV. There we consider three main types of problems. All of these have in common, however, the feature that the point or curve to be determined depends upon two consecutive curves, or upon two consecutive points or surfaces, as the case may be.


II. FUNDAMENTALS OF SPACE CURVE THEORY 


In the theory of consecutive covariant configurations associated with a 
space curve, we restrict ourselves, as we have already stated, to the case of an 
analytic curve C. It is provable in projective differential geometry that for 
such a curve equations (1), when referred to a particular covariant local co- 
ordinate system, assume the simple form 


y = x? + ax’ + + ex®+---, 
It is the purpose of this section to recall to the reader’s mind the facts which 


(2) 




give geometric significance to the vertices of the tetrahedron of reference and 
to the unit point of the particular coordinate system involved.* 

We consider on the curve $C$ the point $P$ which is the vertex $(0, 0, 0)$ of the tetrahedron of reference. The tangent line to $C$ at $P$ has the equations $y = z = 0$, and the osculating plane at $P$ has the equation $z = 0$. The osculating twisted cubic at the point $P$, that is, the twisted cubic having six-point contact with the curve $C$ at $P$, has the equations

(3)    $y = x^2, \quad z = x^3,$

and its osculating conic at $P$, which by definition is the osculating conic of $C$ at $P$, is given in homogeneous form by the equations

(4)    $4x_1x_3 - 3x_2^2 = x_4 = 0,$

where

$x = x_2/x_1, \quad y = x_3/x_1, \quad z = x_4/x_1.$

The null system of the osculating cubic has the equations
(S) & = = = — 


The bundle of quadric surfaces having seven-point contact with the curve 
C at the point P is represented by the equation 


aly — + B(y? — 2x) + — xy — 2?) = 0,7 


in which the coefficients $\alpha$, $\beta$, $\gamma$ are arbitrary constants. These quadrics have as an eighth point of intersection, the point of Sannia, which, in homogeneous coordinates, is the point $(1, 0, 0, 1)$. In this bundle of quadrics there is one cone, called the osculating quadric cone, whose vertex is at the point $P$. Its equation is

(6)    $y^2 - zx = 0.$

All the quadric surfaces which pass through the osculating cubic of the curve $C$ at the point $P$ form another bundle with the equation

$\alpha(y - x^2) + \beta(y^2 - zx) + \gamma(z - xy) = 0,$

in which again $\alpha$, $\beta$, $\gamma$ are quite arbitrary. Two cones of this bundle are also seven-point cones. One of these is the osculating quadric cone; the other has the equation

(7)    $y - x^2 = 0;$


* For a convenient reference for the derivation of the results mentioned in this section (with the exception of the surface of Calapso) see E. P. Lane, Projective Differential Geometry of Curves and Surfaces, University of Chicago Press, 1932, pp. 20-25.




its vertex is at the point (0, 0, 0, 1) called the Halphen point corresponding to 
the point P of the curve C. 

We next recall the definitions of the principal plane and of the principal 
point of the tangent. Let the curve C and its osculating cubic at the point P 
be projected onto their common osculating plane z=0 from a point not on 
that plane. In general, the projections have six-point contact. If, however, the 
center of projection lies in the plane y=0, they have seven-point contact. 
This plane y=0 is called the principal plane at the point P of the curve C. 
The point (0, 1, 0, 0), which corresponds to the principal plane in the null 
system of the osculating cubic given by equations (5), is the principal point 
of the tangent line at P. 

The polar line of this principal point with respect to the osculating conic of the curve $C$ at the point $P$ meets the osculating conic in $P$ and in the point $(0, 0, 1, 0)$. This point together with the principal point and the point of Sannia $(1, 0, 0, 1)$ determines the plane $x_1 - x_4 = 0$, one of whose three intersections with the osculating cubic is the unit point $(1, 1, 1, 1)$.

Lastly, a configuration associated with a point P of the curve C, but not 
assisting in the characterization of our coordinate system, is the surface of 
Calapso.* It is the locus of the vertices of the six-point quadric cones at P of 
the curve C. Its equation, in homogeneous coordinates since that is the form 
in which we shall make use of it, is known to be 


(8) — = O. 


This surface is, in fact, a cubic ruled surface of the Cayley type, sometimes 
called a Cayley cubic scroll. 


III. TRANSFORMATION OF CONSECUTIVE LOCAL COORDINATES 


With every point on the curve C there may be associated a coordinate 
system which is related to the point in the same geometric manner as the 
coordinate system just described is related to the point P. In particular, 
there is such a coordinate system, which we shall refer to as the consecutive 
coordinate system, associated with the point Q consecutive to P. In this section 
we wish to find the equations of transformation between the original co- 
ordinate system at P and the consecutive one at Q. 

To accomplish our purpose, we make use of a suitable auxiliary transformation
of the coordinates x, y, z into new coordinates $\xi, \eta, \zeta$ in an auxiliary
coordinate system having the point Q as origin (0, 0, 0). When the equations
of the curve C have been transformed to these new coordinates, we are ready 


* R. Calapso, Sulle superficie gobbe di terzo grado (del tipo di Cayley) legate al punto di una data 
superficie, Rendiconti dei Lincei, (6), vol. 13 (1931), p. 495. 


48 A. V. NEWTON [January 


to obtain in the coordinates $\xi, \eta, \zeta$ the equations of the configurations related
to the point Q as those of the preceding section are related to the point P. 
The inverse of the auxiliary transformation already used now permits us to 
transform these equations back to the original coordinate system, and hence 
we can find the coordinates in the xyz-system of the vertices of the consecu- 
tive tetrahedron of reference and of the consecutive unit point. Since five 
points, no four of which are coplanar, whose coordinates in each of two 
systems are known, are sufficient to determine the transformation between 
the two systems, the transformation between the original and the consecu- 
tive coordinate systems can now be established. 

We introduce first the auxiliary transformation and then apply it to the 
equations of the curve C. Let us impose on the new coordinate system, be- 
sides the condition that the point Q consecutive to the point P be the origin 
(0, 0, 0), the further conditions that the line $\eta = \zeta = 0$ be the tangent to C at Q,
and that the plane $\zeta = 0$ be the osculating plane. We consider what these conditions
imply for the coordinates x, y, z. For a point with coordinates h, k, l
near the point P on the curve C, equations (2) tell us that 


$$k = h^2 + ah^7 + bh^8 + \cdots, \qquad l = h^3 + h^6 + ch^7 + \cdots.$$


If, now, the point is Q, the point consecutive to P on C, so that we can neglect 


powers of h higher than the first, the coordinates x, y, z of Q become h, 0, 0. 
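The general equations of the tangent to C at a point $(\bar{x}, \bar{y}, \bar{z})$ are

$$y - \bar{y} - (x - \bar{x})\bar{y}' = 0, \qquad z - \bar{z} - (x - \bar{x})\bar{z}' = 0,$$

where $\bar{y}'$ and $\bar{z}'$ mean dy/dx and dz/dx, respectively, evaluated at the point
$(\bar{x}, \bar{y}, \bar{z})$. Hence, the equations of the tangent to C at Q(h, 0, 0) are

$$y - 2hx = z = 0,$$

since we neglect powers of h higher than the first. The general equation of the
osculating plane to C at a point $(\bar{x}, \bar{y}, \bar{z})$ can be written in the form

$$\frac{z - \bar{z} - (x - \bar{x})\bar{z}'}{y - \bar{y} - (x - \bar{x})\bar{y}'} = \frac{\bar{z}''}{\bar{y}''}.$$

For the point Q, this becomes

$$z - 3hy = 0.$$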
Summarizing results, we find that the point (h, 0, 0) in the original coordinate
system becomes the point (0, 0, 0) in the new; the line $y - 2hx = z = 0$ in the old
becomes the line $\eta = \zeta = 0$; and the plane $z - 3hy = 0$ becomes the plane $\zeta = 0$. The


1934] COVARIANT CONFIGURATIONS 49 


auxiliary transformation is completely determined by these relationships; its 
equations are 


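$$\xi = x - h, \qquad \eta = y - 2hx, \qquad \zeta = z - 3hy, \tag{9}$$

or, in homogeneous form,

$$\sigma\xi_1 = x_1, \qquad \sigma\xi_2 = x_2 - hx_1, \qquad \sigma\xi_3 = x_3 - 2hx_2, \qquad \sigma\xi_4 = x_4 - 3hx_3, \tag{10}$$

where $\sigma$ is a proportionality factor. The inverse of the non-homogeneous form
is

$$x = \xi + h, \qquad y = 2h\xi + \eta, \qquad z = 3h\eta + \zeta. \tag{11}$$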


When equations (2) are subjected to transformation (11), we obtain the 
equations of C referred to the new coordinate system, namely, 


$$\begin{aligned} \eta &= \xi^2 + 7ah\xi^6 + (a + 8bh)\xi^7 + \cdots, \\ \zeta &= \xi^3 + 6h\xi^5 + (1 + 7ch)\xi^6 + [c + (8d - 3a)h]\xi^7 + \cdots. \end{aligned} \tag{12}$$

Our next problem is to determine the covariant configurations at the
point Q, by means of which the vertices of the consecutive tetrahedron of
reference and the consecutive unit point may be characterized. We first consider
the consecutive osculating cubic. Its parametric equations in homogeneous
coordinates are

$$\xi_1 = 1 + 12ht^2, \qquad \xi_2 = t + 6ht^3, \qquad \xi_3 = t^2, \qquad \xi_4 = t^3. \tag{13}$$

We shall verify this by showing that these equations satisfy the equations
(12) of the curve C through terms in $\xi^5$. By setting

$$\xi = \xi_2/\xi_1, \qquad \eta = \xi_3/\xi_1, \qquad \zeta = \xi_4/\xi_1,$$

we obtain the non-homogeneous form of equations (13), namely,

$$\xi = t - 6ht^3, \qquad \eta = t^2 - 12ht^4, \qquad \zeta = t^3 - 12ht^5. \tag{14}$$

If we invert the first of these we get

$$t = \xi + 6h\xi^3.$$

When this expression for t is substituted in the last two of equations (14), we
have the non-homogeneous equations of the osculating cubic:

$$\eta = \xi^2, \qquad \zeta = \xi^3 + 6h\xi^5,$$

which obviously coincide with equations (12) through terms in $\xi^5$. Referred
to the original coordinate system we may write the equations of the consecutive
osculating cubic in the form

$$y = x^2, \qquad z = x^3 + 6hx^5. \tag{15}$$




To introduce the consecutive osculating conic, let us first consider the
tangent line to the cubic at a point $\xi$. It intersects the osculating plane $\xi_4 = 0$
of the cubic at the point Q in one point. The locus of this point of intersection
as the point $\xi$ varies over the cubic is by definition the osculating conic
of the cubic at the point Q; it is also called the osculating conic of the curve
C at the point Q, or the consecutive osculating conic. Since the tangent line
is determined by the points $\xi$ and $\xi'$ whose coordinates are given respectively
by equations (13) and by the equations obtained by differentiating (13) with
respect to t, its parametric equations are

$$\begin{aligned} \xi_1 &= 1 + 12ht^2 + 24ht\lambda, & \xi_2 &= t + 6ht^3 + \lambda(1 + 18ht^2), \\ \xi_3 &= t^2 + 2t\lambda, & \xi_4 &= t^3 + 3t^2\lambda. \end{aligned}$$

It meets the plane $\xi_4 = 0$ in the point with coordinates

$$\xi_1 = 1 + 4ht^2, \qquad \xi_2 = 2t/3, \qquad \xi_3 = t^2/3, \qquad \xi_4 = 0.$$

The locus of this point as $\xi$ varies along the cubic is found, by eliminating t
from these equations, to be

$$4\xi_1\xi_3 - 48h\xi_3^2 - 3\xi_2^2 = \xi_4 = 0.$$

Making use of equations (10), we obtain the equation of the consecutive osculating
conic referred to the original coordinate system:

$$4x_1x_3 - 2hx_1x_2 - 48hx_3^2 - 3x_2^2 = x_4 - 3hx_3 = 0. \tag{16}$$

We shall next interest ourselves in the determination of the bundle of 
quadric surfaces having seven-point contact with the curve C at the point Q 
and of their eighth point of intersection, which is the consecutive point of 
Sannia. Let us write the general equation of the second degree in $\xi, \eta, \zeta$ and
impose the condition that it be satisfied identically in $\xi$, through terms of the
sixth degree, by the power series (12) for $\eta$ and $\zeta$. This gives for a general
one of the seven-point quadrics, the equation

$$\alpha(\eta - \xi^2 - 7ah\zeta^2) + \beta(\eta^2 - \xi\zeta + 6h\zeta^2) + \gamma[\xi\eta + \zeta^2 - \zeta + h(7c\zeta^2 + 6\eta\zeta)] = 0,$$

in which $\alpha, \beta, \gamma$ are arbitrary constants. After making use of transformation
(9) we have the result that the equation of the bundle of seven-point quadrics
referred to the original coordinate system is

$$\alpha(y - x^2 - 7ahz^2) + \beta[y^2 - xz + h(z + 6z^2 - xy)] + \gamma[xy + z^2 - z + h(7cz^2 + 6yz - 4y - 2x^2)] = 0. \tag{17}$$


The eighth point of intersection of all the quadrics of this bundle can be determined
as the eighth intersection point of the three particular seven-point
quadrics whose equations are, respectively,

$$\begin{aligned} y - x^2 - 7ahz^2 &= 0, \\ y^2 - xz + h(z + 6z^2 - xy) &= 0, \\ xy + z^2 - z + h(7cz^2 + 6yz - 4y - 2x^2) &= 0. \end{aligned} \tag{18}$$

It is easy to solve these equations if we notice that, since for h = 0 the eighth
solution must be (0, 0, 1), we can suppose that our required solution will be
of the form (hm, hn, 1 + hr). When we substitute these expressions for x, y, z
in equations (18) and neglect powers of h higher than the first, we obtain
simple equations which readily give $(7h, 7ah, 1 - 7ch)$ as the eighth solution
of equations (18). In homogeneous coordinates, then, the consecutive point of
Sannia referred to the original coordinate system is the point


(1, 7h, 7ah, 1 — 7ch). 
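The solution just stated can be checked mechanically. The following sketch, in Python (which is of course no part of the original paper), substitutes the point $(7h, 7ah, 1 - 7ch)$ into the left members of equations (18), taken as printed, and confirms that each vanishes when powers of h higher than the first are neglected. The class `Dual`, the function name, and the sample values of a and c are illustrative assumptions; `Dual` merely carries numbers of the form u + vh with $h^2$ discarded.

```python
from fractions import Fraction


class Dual:
    """A number u + v*h in which h**2 is neglected (first order in h)."""

    def __init__(self, u, v=0):
        self.u, self.v = Fraction(u), Fraction(v)

    @staticmethod
    def lift(value):
        return value if isinstance(value, Dual) else Dual(value)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.u + other.u, self.v + other.v)

    __radd__ = __add__

    def __neg__(self):
        return Dual(-self.u, -self.v)

    def __sub__(self, other):
        return self + (-Dual.lift(other))

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        other = Dual.lift(other)
        return Dual(self.u * other.u, self.u * other.v + self.v * other.u)

    __rmul__ = __mul__

    def is_zero(self):
        return self.u == 0 and self.v == 0


def residuals_18(a, c):
    """Left members of equations (18) at the point (7h, 7ah, 1 - 7ch)."""
    h = Dual(0, 1)
    x, y, z = 7 * h, 7 * a * h, 1 - 7 * c * h
    first = y - x * x - 7 * a * h * z * z
    second = y * y - x * z + h * (z + 6 * z * z - x * y)
    third = x * y + z * z - z + h * (7 * c * z * z + 6 * y * z - 4 * y - 2 * x * x)
    return first, second, third


# Sample (assumed) values of the coefficients a and c of the curve C.
print([r.is_zero() for r in residuals_18(Fraction(2, 3), Fraction(-5, 7))])
```

Exact rational arithmetic is used so that a vanishing residual is an identity in the chosen a and c and not an artifact of rounding.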


Among the seven-point quadrics there is one cone having its vertex at the 
point Q. It is found by making equation (17) homogeneous and imposing the 
condition that the four first partial derivatives of the left member be zero at 
the point (1, 0, 0, 0). In this way we get the conditions $\alpha = \gamma = 0$. When these
values are substituted in equation (17), the equation


$$\eta^2 - \xi\zeta + 6h\zeta^2 = 0$$


is obtained. Hence, the equation of the osculating quadric cone referred to the 
original coordinate system is 


(19) y? — xz + h(z + 62? — xy) = 0. 


We next consider the bundle of quadrics through the consecutive osculat- 
ing cubic, and determine in it the cone different from the osculating quadric 
cone which is also a seven-point cone. Its vertex will be the consecutive Halphen
point. We write again the general equation in $\xi, \eta, \zeta$ of a quadric surface.
This time we demand that it be identically satisfied in t by equations (14) of
the consecutive osculating cubic. The result is

$$\alpha(\eta - \xi^2) + \beta(\eta^2 - \xi\zeta + 6h\zeta^2) + \gamma(\xi\eta - \zeta + 6h\eta\zeta) = 0, \tag{20}$$

wherein $\alpha, \beta, \gamma$ are arbitrary constants. To find the condition that one of
these quadrics be also a seven-point quadric, we compare the left members
of this equation and of equation (17), thus obtaining the relation

$$\gamma = 7ah\alpha$$


between the arbitrary constants. If, moreover, this quadric is to be a cone,
we have a further relation

$$\alpha\beta = -12\alpha^2 h,$$

obtained by setting the discriminant of (20) equal to zero and simplifying by
means of the first relation. If $\alpha = 0$, then $\gamma = 0$ and the cone is the osculating
quadric cone, with which we are not at the moment concerned. Hence, for
our desired cone we know that $\beta = -12\alpha h$, since $\alpha \neq 0$. Therefore, the equation
of the cone is

$$\eta - \xi^2 + 12h(\xi\zeta - \eta^2) + 7ah(\xi\eta - \zeta) = 0,$$

or, in the coordinates x, y, z,

$$y - x^2 + 12h(xz - y^2) + 7ah(xy - z) = 0. \tag{21}$$


To find the coordinates of its vertex we write this last equation in homoge- 
neous form and set the four first partial derivatives of the left member equal 
to zero. The solution (0, 6h, 7ah, 1) of the four equations so obtained is the
vertex of the cone or the consecutive Halphen point. Summarizing, we can 
state that the seven-point cone through the consecutive osculating cubic, which 
is not the osculating quadric cone, is given in the original coordinates by equation 
(21). Its vertex, the consecutive Halphen point, has the coordinates 


0, 6h, 7ah, 1. 
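The vertex can be confirmed by a short computation. The sketch below, in Python with exact rational arithmetic, evaluates the four first partial derivatives of the homogeneous form of (21) at the point (0, 6h, 7ah, 1) and shows that each is a multiple of $h^2$, hence negligible to the order retained here. The homogeneous form is obtained on the assumption $x = x_2/x_1$, $y = x_3/x_1$, $z = x_4/x_1$, which is the convention used in (10); the function name and the sample values of a and h are illustrative only.

```python
from fractions import Fraction


def gradient_at_vertex(a, h):
    """Partial derivatives of
        F = x1*x3 - x2**2 + 12h(x2*x4 - x3**2) + 7ah(x2*x3 - x1*x4)
    evaluated at the point (0, 6h, 7ah, 1)."""
    x1, x2, x3, x4 = 0, 6 * h, 7 * a * h, 1
    return [
        x3 - 7 * a * h * x4,                     # dF/dx1
        -2 * x2 + 12 * h * x4 + 7 * a * h * x3,  # dF/dx2
        x1 - 24 * h * x3 + 7 * a * h * x2,       # dF/dx3
        12 * h * x2 - 7 * a * h * x1,            # dF/dx4
    ]


a, h = Fraction(2, 3), Fraction(1, 10)
full = gradient_at_vertex(a, h)
half = gradient_at_vertex(a, h / 2)
# Halving h divides every component by four: no term of first order in h survives.
print(all(4 * small == large for small, large in zip(half, full)))
```

The scaling test is used in place of a symbolic expansion so that the block needs nothing beyond the standard library.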


Our next concern will be with the consecutive principal plane. To find it 
we first determine the equations of the projections of the curve C and of the 
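consecutive osculating cubic onto the osculating plane $\zeta = 0$, the center of
projection being a general point not on the plane. Then we find the condition
that these projected curves have seven-point contact. The equations of a line
joining the point $(\xi, \eta, \zeta)$ of C and any point $(\alpha, \beta, \gamma)$ not in the plane $\zeta = 0$ are

$$\bar{\xi} = \alpha + (\xi - \alpha)\rho, \qquad \bar{\eta} = \beta + (\eta - \beta)\rho, \qquad \bar{\zeta} = \gamma + (\zeta - \gamma)\rho,$$

where $\rho$ is a parameter. By setting $\rho = -\gamma/(\zeta - \gamma)$ the equations of the projection
of the curve C from the point $(\alpha, \beta, \gamma)$ onto the plane $\zeta = 0$ are found
to be

$$\bar{\xi} = (\alpha\zeta - \gamma\xi)/(\zeta - \gamma), \qquad \bar{\eta} = (\beta\zeta - \gamma\eta)/(\zeta - \gamma).$$

Upon expanding the right members of these equations and replacing $\eta$ and
$\zeta$ by their values in terms of $\xi$ given by equations (12) of the curve C, we
obtain the form

$$\begin{aligned} \bar{\xi} &= \xi - \alpha\xi^3/\gamma + \xi^4/\gamma - 6h\alpha\xi^5/\gamma - [\alpha + \gamma(\alpha + 7\alpha ch - 6h)]\xi^6/\gamma^2 - \cdots, \\ \bar{\eta} &= \xi^2 - \beta\xi^3/\gamma - (6h\beta - 1)\xi^5/\gamma - [\beta + \beta\gamma(1 + 7ch) - 7ah\gamma^2]\xi^6/\gamma^2 - \cdots. \end{aligned}$$

When the first of these is inverted and the power series for $\xi$ in terms of $\bar{\xi}$ so
obtained is substituted for $\xi$ in the second, the resulting equation of the projection
of C onto the osculating plane $\zeta = 0$ is

$$\begin{aligned} \bar{\eta} = \bar{\xi}^2 - \beta\bar{\xi}^3/\gamma + 2\alpha\bar{\xi}^4/\gamma &- [(1 + 6h\beta)\gamma + 3\alpha\beta]\bar{\xi}^5/\gamma^2 \\ &+ [7\alpha^2 + 2\beta - \beta\gamma + h(7a\gamma^2 + 12\alpha\gamma - 7c\beta\gamma)]\bar{\xi}^6/\gamma^2 + \cdots. \end{aligned}$$

By a similar procedure, the projection of the osculating cubic onto the plane
$\zeta = 0$ is found. It proves to be the same as the projection of C through terms
of the sixth degree in $\bar{\xi}$, except that in the coefficient of $\bar{\xi}^6$ the terms

$$[-\beta\gamma + h(7a\gamma^2 - 7c\beta\gamma)]$$

are missing. Hence, if the two projections are to have seven-point contact,
this expression must vanish. This means that

$$\beta + 7ch\beta - 7ah\gamma$$

must equal zero, for we know $\gamma \neq 0$ since the center of projection does not lie
in the plane $\zeta = 0$. This relation between the coordinates $\beta$ and $\gamma$ merely implies
that the center of projection must lie in the plane with the equation

$$(1 + 7ch)\eta - 7ah\zeta = 0.$$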


This plane is by definition the consecutive principal plane. After applying 
transformation (9), we have the equation of the consecutive principal plane in 
the coordinates x, y, z, namely,


$$2hx - (1 + 7ch)y + 7ahz = 0.$$


As the consecutive principal point is the point corresponding to the con- 
secutive principal plane in the null system of the consecutive osculating cubic, 
it will now be necessary to determine this null system. The osculating plane 
at any point $\xi$ of the cubic is found, by differentiating (13) twice and writing
the equation of the plane determined by $\xi, \xi', \xi''$, to be

$$t^3\xi_1 - 3t^2\xi_2 + (3t - 12ht^3)\xi_3 - (1 - 18ht^2)\xi_4 = 0, \tag{22}$$

where t is the value of the parameter corresponding to the particular $\xi$ chosen.
Since this is a cubic equation in t there are in general three values of t which
will satisfy it for any arbitrarily chosen values $\xi_1, \xi_2, \xi_3, \xi_4$. Therefore, through
any point $\eta$ of $S_3$ there pass three osculating planes of the osculating cubic.
If we let $t_1, t_2, t_3$ be the parametric values corresponding to the three points of
osculation, we can easily write the equation of the plane determined by these
points. When it is simplified by means of the values obtained from equation
(22) of the elementary symmetric functions of $t_1, t_2, t_3$, it takes the form

$$\eta_4\xi_1 - 3\eta_3\xi_2 + (3\eta_2 - 30h\eta_4)\xi_3 - (\eta_1 - 30h\eta_3)\xi_4 = 0.$$

By applying transformation (10) we reach the corresponding equation in the
old coordinates, namely,

$$y_4x_1 - 3y_3x_2 + (3y_2 - 30hy_4)x_3 - (y_1 - 30hy_3)x_4 = 0.$$

This shows us that the equations of the null system of the osculating cubic are

$$\xi_1 = y_4, \qquad \xi_2 = -3y_3, \qquad \xi_3 = 3y_2 - 30hy_4, \qquad \xi_4 = -y_1 + 30hy_3,$$

where $\xi_1, \cdots, \xi_4$ are now the coordinates of the plane corresponding to the
point with the coordinates y. From these equations it is easy to derive the
result that the coordinates of the consecutive principal point referred to the original
system are


21ah, 1 + 7ch, 2h, 0.


There is one more vertex of the consecutive tetrahedron of reference 
whose coordinates relative to the xyz-system are to be determined. The polar 
line of the point (21ah, 1+7ch, 2h, 0) with respect to the conic given by 
equation (16) has the equation 


$$hx_1 - x_2 + 14ahx_3 = 0.$$
Solution of this equation with (16) gives 
12h, 14ah, 1, 3h, 


as the coordinates referred to the original system of the point distinct from the 
point Q, in which the polar line of the consecutive principal point with respect 
to the consecutive osculating conic meets the conic. This is the required vertex. 

Finally, in order to determine the equations of transformation between 
the original coordinate system and the consecutive one, it is sufficient, with 
the information we already have, to know the coordinates in the first system 
of the unit point. The plane determined by the consecutive point of Sannia, 
the consecutive principal point, and the point distinct from the point Q in 
which the polar line of the consecutive principal point with respect to the 
consecutive osculating conic meets the conic is found to be 


$$x_1 - 21ahx_2 - 9hx_3 - (1 + 7ch)x_4 = 0,$$

or in non-homogeneous coordinates,

$$1 - 21ahx - 9hy - (1 + 7ch)z = 0.$$

Solution of this equation with equations (15) for the consecutive osculating
cubic is made simple by assuming the value of x to be of the form 1 + rh, an
assumption which is permissible since we already know the required solution
has x = 1 for the case h = 0. When the result is expressed in homogeneous
coordinates we have for the consecutive unit point referred to the original coordinate
system the point

$$[1,\ 1 - h(5 + 7a + 7c/3),\ 1 - h(10 + 14a + 14c/3),\ 1 - h(9 + 21a + 7c)].$$


We are now ready to derive the equations of transformation between the 
coordinates $x_1, x_2, x_3, x_4$ in the original system and the coordinates, which we
shall denote by $X_1, X_2, X_3, X_4$, in the consecutive system, since we have found
the coordinates, in each of the two systems, of five points no four of which are 
coplanar, namely, the four vertices of the consecutive tetrahedron and its 
unit point. We write the general linear equations of transformation of the 
coordinates x into the coordinates X. Substitution in these equations of the 
coordinates of each of the five pairs of corresponding points yields twenty 
equations homogeneous in the sixteen constants of the transformation and in 
five proportionality factors. When the coefficients of the transformation are 
determined from these equations, we obtain, after simplification, the follow- 
ing result: 

The equations of transformation from the original system to the consecutive 
system are 


$$\begin{aligned} \rho x_1 &= X_1 + 21ahX_2 + 12hX_3, \\ \rho x_2 &= hX_1 + (1 - 7ch/3)X_2 + 14ahX_3 + 6hX_4, \\ \rho x_3 &= 2hX_2 + (1 - 14ch/3)X_3 + 7ahX_4, \\ \rho x_4 &= 3hX_3 + (1 - 7ch)X_4, \end{aligned} \tag{23}$$

where $\rho$ is a proportionality factor.
In non-homogeneous coordinates they become

$$\begin{aligned} x &= X + h(1 - 7cX/3 + 14aY + 6Z - 21aX^2 - 12XY), \\ y &= Y + h(2X - 14cY/3 + 7aZ - 21aXY - 12Y^2), \\ z &= Z + h(3Y - 7cZ - 21aXZ - 12YZ). \end{aligned} \tag{24}$$

By interchanging $x_1, x_2, x_3, x_4$ with $X_1, X_2, X_3, X_4$, respectively, in equations
(23), and changing the sign of h, the inverse transformation in homogeneous
coordinates is readily found to be given by the equations


$$\begin{aligned} \sigma X_1 &= x_1 - 21ahx_2 - 12hx_3, \\ \sigma X_2 &= -hx_1 + (1 + 7ch/3)x_2 - 14ahx_3 - 6hx_4, \\ \sigma X_3 &= -2hx_2 + (1 + 14ch/3)x_3 - 7ahx_4, \\ \sigma X_4 &= -3hx_3 + (1 + 7ch)x_4, \end{aligned} \tag{25}$$

where $\sigma$ is a factor of proportionality.

It is now possible to determine the equations of the curve C referred to 
the consecutive coordinate system. In equations (2) we substitute for the 
variables x, y, z their values in terms of X, Y, Z as given by equations (24). 
In the resulting equations we replace Y and Z by power series in X with un- 
determined coefficients. Since we now have two identities in X, the coefficient 
of each power of X can be equated to zero. Solution of the equations so ob- 
tained, for the coefficients of the power series representing Y and Z, permits 
us to write for the equations of C in the coordinates X, Y, Z, 


$$Y = X^2 + AX^7 + BX^8 + \cdots, \qquad Z = X^3 + X^6 + CX^7 + DX^8 + \cdots,$$

where A, B, C, D are defined by

$$A = a + h(12 + 8b - 56ac/3),$$
$$B = b + h(12c - 7ad - 14bc + 9f),$$
$$\begin{aligned} C &= c + h(8d - 24a - 28c^2/3), \\ D &= d + h(9g - 28ac - 6 - 35cd/3 - 3b). \end{aligned} \tag{26}$$

By means of transformation (25) and the relations (26) we can now find 
the equation, or equations, of the locus consecutive to a given locus, not 
only from the geometric definition of the locus, as we have been doing 
throughout this section, but also by a more direct method. We write equa- 
tions identical with the homogeneous form of those of the given locus, except 
that $x_1, x_2, x_3, x_4$ are replaced by $X_1, X_2, X_3, X_4$, and $a, b, c, d, \cdots$ by
$A, B, C, D, \cdots$, and then apply transformations (25) and (26). It is obvious
that the method to be used to obtain the point consecutive to one whose 
coordinates are known, is to substitute the given coordinates in equations 
(23). 
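As an illustration of this last remark, the sketch below (Python, exact rational arithmetic) applies transformation (23) to the unit point (1, 1, 1, 1) and recovers the consecutive unit point found earlier in this section. Only the coefficients of h in (23) are needed: if $x_i = 1 + hs_i$, then to the first order $x_i/x_1 = 1 + h(s_i - s_1)$. The function names and the sample values of a and c are illustrative assumptions.

```python
from fractions import Fraction


def first_order_part(a, c):
    """Matrix N with x = (I + h*N) X, read off from equations (23)."""
    third = Fraction(1, 3)
    return [
        [0, 21 * a, 12, 0],
        [1, -7 * c * third, 14 * a, 6],
        [0, 2, -14 * c * third, 7 * a],
        [0, 0, 3, -7 * c],
    ]


def unit_point_shift(a, c):
    """Coefficients of h in x2/x1, x3/x1, x4/x1 when X = (1, 1, 1, 1)."""
    row_sums = [sum(row) for row in first_order_part(a, c)]
    return [s - row_sums[0] for s in row_sums[1:]]


a, c = Fraction(2, 5), Fraction(-3, 4)
R = 5 + 7 * a + 7 * c / 3
# Expected from the text: -(5 + 7a + 7c/3), -(10 + 14a + 14c/3), -(9 + 21a + 7c).
print(unit_point_shift(a, c) == [-R, -2 * R, -(3 * R - 6)])
```

The same row sums, taken one column at a time, give the consecutive vertices of the tetrahedron up to a proportionality factor.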

IV. APPLICATIONS OF THE TRANSFORMATION 


§IV is concerned with some applications which can be made of the results 
obtained in the preceding section. The problems we shall consider are of three 
types. The first of these is the determination of the tangents of the loci of 
various covariant points associated with a point P on the curve C. We have 
already found the consecutive points to a number of covariant points, as, for 
example, the vertices of the original tetrahedron of reference. Each covariant 




point with its consecutive point determines a line which is the tangent at 
the point to its locus as the point P moves along C. Such covariant curves as 
intersect their consecutive curves have envelopes generated by the inter- 
section points. The envelopes are the edges of regression of the surfaces gen- 
erated by the curves, and the intersection points are the focal points on the
edges of regression. Our second problem is to find out which covariant curves 
that we have discussed have envelopes and to determine their contact or 
focal points. In the third kind of problem we determine the characteristic 
curve of a covariant surface, that is, the curve of intersection of a surface 
with its consecutive surface. We go yet farther than this and find the focal 
point of the edge of regression of the envelope of the surface, that is, the 
point in which a characteristic curve intersects its consecutive curve. 

We turn now to the determination of the tangents of the loci of some co- 
variant points. Obviously, the locus of the point (1, 0, 0, 0) is the curve C 
itself and its tangent at any point is the tangent to C at that point. The points 
consecutive to the other three vertices 


(0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

of the tetrahedron of reference are, respectively,

(21ah, 1 + 7ch, 2h, 0), (12h, 14ah, 1, 3h), (0, 6h, 7ah, 1).

Hence we find for the tangents to the respective loci of the points (0, 1, 0, 0),
(0, 0, 1, 0), (0, 0, 0, 1), the lines whose equations are

$$\begin{aligned} 2x_1 - 21ax_3 &= x_4 = 0, \\ 7ax_1 - 6x_2 &= x_1 - 4x_4 = 0, \\ 7ax_2 - 6x_3 &= x_1 = 0. \end{aligned}$$

The point

$$[1,\ 1 - h(5 + 7a + 7c/3),\ 1 - h(10 + 14a + 14c/3),\ 1 - h(9 + 21a + 7c)]$$

is consecutive to the unit point (1, 1, 1, 1); together these points determine,
as tangent of the locus of the unit point, the line

$$x_1 - 2x_2 + x_3 = (2R - 6)x_1 - (3R - 6)x_2 + Rx_4 = 0,$$

where R is defined by

$$R = 5 + 7a + 7c/3.$$

The point consecutive to the point of Sannia (1, 0, 0, 1) is the point
$(1, 7h, 7ah, 1 - 7ch)$. This gives us that the tangent to the locus of the point (1, 0, 0, 1)
is the line

$$ax_2 - x_3 = ax_1 - cx_3 - ax_4 = 0.$$


As an example of the second type of problem we consider first the case of 
the osculating conic, whose equations are given by (4). Solution of these equa- 
tions with equations (16) of the consecutive osculating conic gives the point 
(1, 0, 0, 0) for the only intersection of the two curves. Therefore, the osculating 
conic has no envelope but the curve C itself. The case of the osculating cubic 
proves more fruitful. If we make equations (3) and (15) homogeneous and 
solve them simultaneously we find that the contact points of the osculating cubic 
with its envelope are the points (1,0, 0,0) and (0, 0, 0, 1). 

We now investigate the situation for some covariant lines. It is obvious 
that the envelope of the tangent line is the curve C. Let us consider some other 
edge of the tetrahedron of reference, as the line $x_1 = x_2 = 0$. Its consecutive
line is $X_1 = X_2 = 0$, or by applying (25),

$$x_1 - 21ahx_2 - 12hx_3 = -hx_1 + (1 + 7ch/3)x_2 - 14ahx_3 - 6hx_4 = 0.$$

There is no common solution of these four equations, and hence the line
$x_1 = x_2 = 0$ does not intersect its consecutive line. In a similar way we find that
the other edges of the tetrahedron of reference are skew to the corresponding 
edges of the consecutive tetrahedron. There are many other covariant lines 
we might consider, for example, the lines joining the point of Sannia to the 


vertices (0, 1,0, 0) and (0, 0, 1, 0), and to the unit point, and the lines joining 
the unit point to the vertices of the tetrahedron. In each case we write the 
equations of the line determined by the two points considered, then write 
identical equations only with coordinates X instead of coordinates x and ap- 
ply transformation (25). In every instance we find that the four equations so 
obtained have no common solution. We may sum up these findings as follows: 

The tangent line has the curve C for envelope, but the other edges of the tetra- 
hedron of reference, the lines joining the point of Sannia or the unit point to a 
vertex of the tetrahedron, and the line joining the point of Sannia with the unit 
point do not generate developable surfaces and hence have no envelopes. 

Our third kind of problem is concerned with the characteristic curves and 
edges of regression of the envelopes of covariant surfaces. Let us take first 
the plane $x_1 = 0$. The equation of the consecutive plane is $X_1 = 0$ or, after making
use of (25),

$$x_1 - 21ahx_2 - 12hx_3 = 0.$$

The characteristic line of the plane $x_1 = 0$ is then the intersection of these two
planes, which is the same as the intersection of the two planes

$$x_1 = 0, \qquad 7ax_2 + 4x_3 = 0. \tag{27}$$


To find the point at which the plane $x_1 = 0$ touches the edge of regression we
must know the consecutive characteristic line. Since the characteristic line
is a covariant configuration and is determined by the two planes (27), its
consecutive characteristic line is determined by the two consecutive planes

$$X_1 = 0, \qquad 7AX_2 + 4X_3 = 0.$$

The first of these gives us nothing new, but when the second has been subjected
to transformations (25) and (26) we find that the point where the plane
$x_1 = 0$ touches the edge of regression is determined by the system of planes
with the equations

$$x_1 = 0, \qquad 7ax_2 + 4x_3 = 0, \qquad (76 + 56b - 147ac)x_2 - 98a^2x_3 - 70ax_4 = 0.$$

After solving these equations, we state our conclusions: the characteristic line
of the plane $x_1 = 0$ has the equations (27); the focal point of the edge of regression
of the developable of this plane has the coordinates

$$0,\ 140a,\ -245a^2,\ M,$$

where M is defined by

$$M = 152 + 112b - 294ac + 343a^3. \tag{28}$$


In a similar way we find the characteristic lines and points of contact with 
the edges of regression of the developables of other covariant planes. The 
characteristic lines of the planes $x_2 = 0$, $x_3 = 0$, $x_4 = 0$, and of the plane $x_1 - x_4 = 0$
determined by the point of Sannia and the two vertices (0, 0, 1, 0) and (0, 1, 0, 0)
have, respectively, the equations

$$\begin{aligned} x_2 &= x_1 + 14ax_3 + 6x_4 = 0, & x_3 &= 2x_2 + 7ax_4 = 0, \\ x_4 &= x_3 = 0, & x_1 - x_4 &= 21ax_2 + 9x_3 + 7cx_4 = 0. \end{aligned}$$

The respective focal points of the developables of these planes are

$$\begin{gathered} (3M - 42 - 343a^3,\ 0,\ 21c - 49a^2,\ N), \\ (3 - N,\ -7a,\ 0,\ 2), \qquad (1, 0, 0, 0), \\ (3M - S,\ 49c^2 - 343a^2c - 84d + 567a,\ 196ad - 1323a^2 - 273c - 196bc + 343ac^2,\ 3M - S), \end{gathered}$$

where M is as defined in equation (28) and N and S are given by

$$N = 98ac - 56b - 69, \qquad S = 105 + 84b - 294ac.$$


The characteristic curve of the osculating quadric cone is the curve of 


intersection of the two quadric surfaces whose non-homogeneous equations
are (6) and (19). This is equivalent to the curve of intersection of the first surface
and the surface with homogeneous equation

$$x_2x_3 - x_1x_4 - 6x_4^2 = 0;$$

it is composed of the tangent line $x_3 = x_4 = 0$ and the cubic curve

$$x_1 = t^3 - 6, \qquad x_2 = t^2, \qquad x_3 = t, \qquad x_4 = 1. \tag{29}$$

Hence, we state the conclusion: the characteristic curve of the osculating quadric
cone of the curve C at the point P consists of the tangent line to the curve C at the
point P and the cubic (29). This cubic has no contact with the curve C at the
point P. The cone has seven-point contact with its edge of regression at (1, 0, 0, 0)
and touches it again at the point

$$(343c^3 - 750,\ 245c^2,\ 175c,\ 125).$$


From equations (7) and (21) made homogeneous we find that the inter- 
section of the two quadric surfaces 


$$x_1x_3 - x_2^2 = 0, \qquad 12(x_2x_4 - x_3^2) - 7a(x_1x_4 - x_2x_3) = 0$$


forms the characteristic of the seven-point cone with vertex at the Halphen 


point. As we should expect since this cone also contains the osculating cubic, 
the osculating cubic makes up a part of the characteristic, the remainder 
being the line whose equations are 


$$7ax_1 - 12x_2 = 0, \qquad 7ax_2 - 12x_3 = 0. \tag{30}$$


So we state the following result: 


The characteristic of the seven-point cone with vertex at the Halphen point 
corresponding to the point P of the curve C consists of the osculating cubic to C at 
P and of the line with equations (30). This cone has five-point contact with its 
edge of regression at (1, 0, 0, 0), two-point contact at


(1728, 1008, 588a?, 


and single contact at 
(1728, 1008a, 5882, 13720 — 247), 
where T is defined by 
$$T = 96 + 56b - 147ac.$$


The characteristic curve of the surface of Calapso whose equation is 




given by (8) is the intersection of the two cubic surfaces 


$$x_1x_4^2 - 3x_2x_3x_4 + 2x_3^3 = 0,$$


xy — + — = 0; 


it, also, includes the tangent line $x_3 = x_4 = 0$ as a part.

A second procedure for determining characteristic curves is suggested by 
the fact, which we recall from differential geometry, that the characteristic 
curve of a surface belonging to a one-parameter family has for its equations 
the equation of the surface and the derivative of that equation with respect 
to the parameter. To use this method we see at once that we need differentiation
formulas for $x_1, x_2, x_3, x_4$ with respect to the parameter h. These are
deduced from equations (25) in which $\sigma$ is taken to be 1. In each equation the
term on the right which is free of h is transposed to the left; then both members
of the equation are divided by h. Taking the limits as h approaches 0,
we obtain the formulas of differentiation for $x_1, x_2, x_3, x_4$ with respect to h:

$$\begin{aligned} x_1' &= -21ax_2 - 12x_3, \\ x_2' &= -x_1 + 7cx_2/3 - 14ax_3 - 6x_4, \\ x_3' &= -2x_2 + 14cx_3/3 - 7ax_4, \\ x_4' &= -3x_3 + 7cx_4, \end{aligned} \tag{31}$$

where $x_1'$ denotes the derivative of $x_1$ with respect to h, and so on. As an
instance of the application of these formulas to the solution of our problem 
we shall consider the osculating quadric cone with equation 


$$x_3^2 - x_2x_4 = 0.$$

Differentiation of this equation gives

$$2x_3x_3' - x_2x_4' - x_2'x_4 = 0.$$

Substitution from (31), followed by simplification by means of the equation
of the osculating quadric cone, corroborates our conclusion of a previous
paragraph, that the characteristic curve of this quadric is its intersection
with the quadric $x_2x_3 - x_1x_4 - 6x_4^2 = 0$.

Finally, we mention the cross ratio of four lines in the plane $x_1 = 0$
through the Halphen point (0, 0, 0, 1). The two edges of the tetrahedron of
reference

$$x_1 = x_2 = 0, \qquad x_1 = x_3 = 0,$$

the tangent to the locus of the point (0, 0, 0, 1) with equations

$$x_1 = 7ax_2 - 6x_3 = 0,$$

and the characteristic line of the plane $x_1 = 0$ with equations

$$x_1 = 7ax_2 + 4x_3 = 0,$$


have cross ratio equal to —3/2. 
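The value of this cross ratio is easily verified. In the sketch below (Python, not part of the original paper) each of the four lines, which lie in the plane $x_1 = 0$ and pass through (0, 0, 0, 1), is represented by the pair (p, q) of its equation $px_2 + qx_3 = 0$, and the cross ratio is formed from two-rowed determinants. The function names, the ordering convention for the four lines, and the sample values of a are assumptions made for the illustration; a is supposed different from zero.

```python
from fractions import Fraction


def det(l, m):
    """Two-rowed determinant of the coefficient pairs of two lines."""
    return l[0] * m[1] - l[1] * m[0]


def cross_ratio(l1, l2, l3, l4):
    """Cross ratio (l1, l2; l3, l4) of four lines of a pencil."""
    numerator = Fraction(det(l1, l3)) * Fraction(det(l2, l4))
    denominator = Fraction(det(l1, l4)) * Fraction(det(l2, l3))
    return numerator / denominator


def four_lines(a):
    """The edges x2 = 0 and x3 = 0, the tangent 7a*x2 - 6*x3 = 0,
    and the characteristic line 7a*x2 + 4*x3 = 0."""
    return (1, 0), (0, 1), (7 * a, -6), (7 * a, 4)


for a in (Fraction(1), Fraction(-5, 3)):
    print(cross_ratio(*four_lines(a)))
```

The result does not depend on a, in agreement with the statement above.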


UNIVERSITY OF CHICAGO, 
CHICAGO, ILL.


ANALYTIC EXTENSIONS OF DIFFERENTIABLE 
FUNCTIONS DEFINED IN CLOSED SETS* 


BY 
HASSLER WHITNEY†


I. DIFFERENTIABLE FUNCTIONS IN CLOSED SETS 


1. Introduction. Let A be a closed set, bounded or unbounded, in eu- 
clidean n-space E, and let f(x) be a function defined and continuous in A. 
It is well known that this function can be extended so as to be continuous 
throughout E.‡ If A satisfies certain conditions, the solution of the Dirichlet
problem is a function harmonic in E—A and taking on the given boundary 
values in A. Two questions which arise are the following: Is there always a 
function differentiable, or perhaps analytic, in E—A, and taking on the given
values in A? If the given function f(x) is in some sense differentiable in A, 
can the extension F(x) be made differentiable to the same order through- 
out E? 

These questions are answered in the affirmative in Theorem I. We use a 
definition of the derivatives of a function in a general set which arises nat- 
urally from a consideration of Taylor’s formula. In Part II, a differentiable 
extension of f(x) is found, whether f(x) is differentiable to finite or infinite 
order. Part III is devoted to some general approximation theorems. It is well 
known that a continuous function in a bounded closed set can be approxi- 
mated uniformly (together with any finite number of derivatives) by poly- 
nomials; we show that functions defined in open sets may be approximated 
(together with derivatives) by analytic functions, the approximation being 
closer and closer as we approach the boundary of the set. This theorem, to- 
gether with the results of Part II, furnish an immediate proof of Theorem I. 
In Part IV we give some extensions of Theorem I; in particular, we show that 


* Presented to the Society, December 29, 1932; received by the editors March 29, 1933, and, 
after revision, May 2, 1933. 

† National Research Fellow.

‡ See references in a paper by P. Urysohn, Mathematische Annalen, vol. 94 (1925), p. 293,
footnote 51.

A continuous extension the author has not seen in the literature may be given as follows; we
assume for simplicity that A is bounded. Let h(r) ($r \ge 0$) be a continuous and monotone increasing
function such that h(0) = 0, and if x and y are any two points of A whose distance apart is $r_{xy}$, then
$|f(x) - f(y)| \le h(r_{xy})$. For any points x of E and y of A, set $H(x, y) = f(y) - h(r_{xy})$; then if x is in A,
$H(x, y) \le f(x)$. The continuous extension of f(x) is F(x), which at each point x of E equals the maximum
of H(x, y) as y varies over A.




64 HASSLER WHITNEY [January 


the extension of f(x) may be made analytic at the isolated points of A. The- 
orem III includes all preceding results but Lemma 7. 

2. Notations. We shall write all equations involving n variables as if
there were but a single variable present. For instance, we write 


So(x) for fo... 
Qk rts thn 
k ae ), 
Xn n 


Di f(x’) for 


(1) # 


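etc. For any n-fold subscript k, we put
$\sigma_k = k_1 + \cdots + k_n$.
Note that $\sigma_{k+l} = \sigma_k + \sigma_l$. $r_{xy}$ will always denote the distance between x and y
(unless x and y are complex). As an example, (3.1) below is short for

$f_{k_1 \cdots k_n}(x_1', \cdots, x_n') = \sum_{l_1 + \cdots + l_n \le m - (k_1 + \cdots + k_n)} \dfrac{f_{k_1+l_1, \cdots, k_n+l_n}(x_1, \cdots, x_n)}{l_1! \cdots l_n!} (x_1' - x_1)^{l_1} \cdots (x_n' - x_n)^{l_n} + R_{k_1 \cdots k_n}(x_1', \cdots, x_n'; x_1, \cdots, x_n)$.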


3. Differentiable functions in subsets of E. Let f(x) be defined in the set
A, and let m be an integer $\ge 0$. We say $f(x) = f_0(x)$ is of class $C^m$ in A in terms
of the functions $f_k(x)$ ($\sigma_k \le m$) if the functions $f_k(x)$ are defined in A for all
n-fold subscripts k with $\sigma_k \le m$, and

(3.1) $f_k(x') = \sum_{\sigma_l \le m - \sigma_k} \dfrac{f_{k+l}(x)}{l!} (x' - x)^l + R_k(x'; x)$

for each $f_k(x)$ ($\sigma_k \le m$), where $R_k(x'; x)$ has the following property. Given any
point $x^0$ of A and any $\epsilon > 0$, there is a $\delta > 0$ such that if x and x' are any two
points of A with $r_{xx^0} < \delta$ and $r_{x'x^0} < \delta$, then

(3.2) $|R_k(x'; x)| \le r_{xx'}^{m - \sigma_k}\, \epsilon$.


One might define the derivatives of a function at the points of a set B, 
when the function is defined in a larger set A. We shall not do this here. 

If m = 0, (3.1) and (3.2) state merely that f(x) is continuous. Note that
the conditions are satisfied automatically at all isolated points of A, no matter
how the $f_k(x)$ are defined there.


1934] EXTENSIONS OF DIFFERENTIABLE FUNCTIONS 65 


It is easily seen that the $f_k(x)$ are continuous in a neighborhood of each
point of A, and are thus bounded there. From this we prove that if f(x) is of
class $C^m$ in A in terms of the $f_k(x)$ ($\sigma_k \le m$), then it is of class $C^{m'}$ in A ($m' < m$)
in terms of the $f_k(x)$ ($\sigma_k \le m'$).

Any function we shall say is of class $C^{-1}$ in A. f(x) is of class $C^\infty$ in A in
terms of the $f_k(x)$ (defined for all k) if it is of class $C^m$ in A in terms of the
$f_k(x)$ ($\sigma_k \le m$) for each m.

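Suppose f(x) is defined throughout the region R, and is of class $C^m$ in
terms of the $f_k(x)$ ($\sigma_k \le m$). Then putting $x = (x_1, \cdots, x_n)$,
$x' = (x_1, \cdots, x_h + \Delta x_h, \cdots, x_n)$, (3.1) gives

(3.3) $f_k(x') = f_k(x) + f_{k_1 \cdots k_h+1 \cdots k_n}(x)\, \Delta x_h + R_k'(x'; x)$

(provided $\sigma_k < m$), where $R_k'(x'; x)/\Delta x_h \to 0$ as $\Delta x_h \to 0$, which shows that

$\dfrac{\partial f_k(x)}{\partial x_h} = f_{k_1 \cdots k_h+1 \cdots k_n}(x)$

in R; thus in this case, f(x) is of class $C^m$ in the ordinary sense, and the $f_k(x)$
are the partial derivatives of f(x). The converse is true, by Taylor's Theorem.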
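
The remainder condition of this section can be watched numerically. The sketch below is only an illustration under stated assumptions: one variable (n = 1), m = 2, and f = sin standing in for the given function, so that the $f_k$ are its ordinary derivatives. The helper names `derivs` and `remainder` are hypothetical. It prints the largest ratio $|R_k(x'; x)| / r_{xx'}^{m-k}$ over k for shrinking distances; condition (3.2) asks that this ratio can be made small.

```python
import math

def derivs(x, m):
    """f_k(x) for f = sin, k = 0..m (the k-th derivative of sin)."""
    cycle = [math.sin, math.cos, lambda t: -math.sin(t), lambda t: -math.cos(t)]
    return [cycle[k % 4](x) for k in range(m + 1)]

def remainder(k, xp, x, m):
    """R_k(x'; x) = f_k(x') - sum_{l <= m-k} f_{k+l}(x) (x'-x)^l / l!  (n = 1)."""
    fx = derivs(x, m)
    taylor = sum(fx[k + l] * (xp - x) ** l / math.factorial(l)
                 for l in range(m - k + 1))
    return derivs(xp, m)[k] - taylor

m = 2
ratios = []
for r in (1e-1, 1e-2, 1e-3):
    x, xp = 0.3, 0.3 + r
    # worst ratio |R_k| / r^(m-k) over k = 0..m
    ratios.append(max(abs(remainder(k, xp, x, m)) / r ** (m - k)
                      for k in range(m + 1)))
print(ratios)
```

The ratios shrink roughly in proportion to the distance, which is the behavior the definition requires of a function that is of class $C^m$ in the ordinary sense.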

4. The main theorem of the present paper is the following: 

THEOREM I. Let A be a closed subset of E, and let $f(x) = f_0(x)$ be of class $C^m$
(m finite or infinite) in A in terms of the $f_k(x)$ ($\sigma_k \le m$). Then there is a function
F(x) of class $C^m$ in E in the ordinary sense, such that

(1) $F(x) = f(x)$ in A,

(2) $D_k F(x) = f_k(x)$ in A ($\sigma_k \le m$),

(3) F(x) is analytic in $E - A$.†

Of course (2) includes (1). 

No such theorem holds if we leave out the uniformity condition on
$R_k(x'; x)$, i.e. if we assume merely that for any x and $\epsilon > 0$ there is a $\delta > 0$ such
that if $r_{xx'} < \delta$, then $|R_k(x'; x)| < r_{xx'}^{m - \sigma_k}\, \epsilon$. The following example shows this.
Let A be the set of points (using one variable) $x = 0$, $1/2^s$ and $1/2^s + 1/2^{2s}$
($s = 1, 2, \cdots$). Set $f(x) = 0$ at $x = 0$ and $1/2^s$, and $f(x) = 1/2^{2s}$ at the remaining
points. Set $f_1(x) = 0$ in A. The above condition is satisfied, but there is no
extension of f(x) which has a continuous first derivative.
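
A short computation makes the obstruction in this example visible. The sketch assumes the example is read with points $a_s = 1/2^s$ and $b_s = 1/2^s + 1/2^{2s}$, and with $f(b_s) = 1/2^{2s}$ (the exponents are an assumption about the printed text); the function names are hypothetical. The quotient taken from the origin tends to 0, in agreement with $f_1(0) = 0$, while the quotient between the neighboring points $a_s$ and $b_s$ stays equal to 1, so by the law of the mean any differentiable extension has derivative 1 at points arbitrarily near the origin.

```python
from fractions import Fraction

def pair(s):
    """Return (a_s, b_s), where f(a_s) = 0 and f(b_s) = 1/2^(2s)."""
    a = Fraction(1, 2 ** s)
    b = a + Fraction(1, 2 ** (2 * s))
    return a, b

def f_at_b(s):
    return Fraction(1, 2 ** (2 * s))

rows = []
for s in range(1, 8):
    a, b = pair(s)
    at_origin = f_at_b(s) / b            # (f(b_s) - f(0)) / (b_s - 0)
    between = (f_at_b(s) - 0) / (b - a)  # (f(b_s) - f(a_s)) / (b_s - a_s)
    rows.append((s, at_origin, between))
    print(s, float(at_origin), between)
```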

5. The following lemma will be needed; its proof is elementary. 


† It is seen from the proof in §16 that F(x) is analytic in a complex region with the following
property. If x is a point of $E - A$ distant $3\rho$ from A, then the region contains all points within a
distance $\rho$ of x.




LEMMA 1. Let w(z) be a continuous function of one variable defined throughout
an interval containing $z_0$, let A* be a closed set in this interval, and let $w_1$ be a
fixed number. Suppose that for every $\epsilon > 0$ there is a $\delta > 0$ such that
(1) if z is in A* and $|z - z_0| < \delta$, then
$\left| \dfrac{w(z) - w(z_0)}{z - z_0} - w_1 \right| < \epsilon$;
(2) if z is not in A* and $|z - z_0| < \delta$, then the derivative w'(z) exists and
$|w'(z) - w_1| < \epsilon$.
Then w(z) has a derivative at $z_0$, and $w'(z_0) = w_1$.
II. DIFFERENTIABLE EXTENSIONS 
6. The functions $\psi_k(x'; x)$. We shall make use of functions defined as
follows for x in A and x' in E (m finite):

(6.1) $\psi_k(x'; x) = \sum_{\sigma_l \le m - \sigma_k} \dfrac{f_{k+l}(x)}{l!} (x' - x)^l$ ($\sigma_k \le m$).

$\psi_k(x'; x)$ is the value at x' of the polynomial of degree at most $m - \sigma_k$ which
approximates the function $f_k(x)$ to the $(m - \sigma_k)$th order at x. Keeping x fixed,
it is a polynomial in x', given by Taylor's formula in terms of its value and
derivatives at x. In terms of these functions, (3.1) becomes

(6.2) $f_k(x') = \psi_k(x'; x) + R_k(x'; x)$ ($\sigma_k \le m$).
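The $l$th derivative of the function of x', $\psi_k(x'; x)$, at x' is $\psi_{k+l}(x'; x)$; if we
express $\psi_k(x''; x)$ by Taylor's formula in terms of its value and derivatives
at x', we obtain

$\psi_k(x''; x) = \sum_{\sigma_l \le m - \sigma_k} \dfrac{\psi_{k+l}(x'; x)}{l!} (x'' - x')^l$.

The definition of $\psi_k(x''; x')$ in conjunction with this identity gives, for any
points x and x' in A and x'' in E,

(6.3) $\psi_k(x''; x') - \psi_k(x''; x) = \sum_{\sigma_l \le m - \sigma_k} \dfrac{R_{k+l}(x'; x)}{l!} (x'' - x')^l$.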



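7. The function $\Omega(x)$. Let R be the region given by the inequalities
$|x_h| < 1$ ($h = 1, \cdots, n$), let R' be R minus the origin, and let R* be the
boundary of R. Define the functions $\theta$, $\theta'$, $\Omega$ as follows:

(7.1) $\theta(x) = 2(1 - x_1^2) \cdots (1 - x_n^2) - 1$ in R',

(7.2) $\theta'(x) = \dfrac{\theta(x)}{1 - \theta^2(x)}$ in R',

(7.3) $\Omega(x) = e^{\theta'(x)}$ in R', $\Omega(x) = 0$ in $E - R$.

It is seen that $-1 < \theta(x) < +1$, $\theta(x) \to 1$ as $x \to 0$ and $\theta(x) \to -1$ as $x \to R^*$;
hence $\theta'(x) \to +\infty$ as $x \to 0$ and $\theta'(x) \to -\infty$ as $x \to R^*$. Consequently
$\Omega(x) \to \infty$ to infinite order as $x \to 0$ and $\Omega(x) \to 0$ to infinite order as $x \to R^*$;
also $\Omega(x)$ is of class $C^\infty$ for $x \ne 0$. If $\Omega'(x) = 1/\Omega(x)$ in R' and $\Omega'(x) = 0$ for
x = 0, then $\Omega'(x)$ is of class $C^\infty$ in R.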
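
The behavior of the function of this section near the origin and near the boundary can be tabulated directly. The sketch below assumes the reading $\theta(x) = 2\prod_h (1 - x_h^2) - 1$, $\theta' = \theta/(1 - \theta^2)$ and $\Omega = e^{\theta'}$ inside the cube (the exponential form is an assumption about the damaged formula); the function names are hypothetical, and the origin is excluded as in the text.

```python
import math

def theta(x):
    """theta(x) = 2 (1 - x_1^2) ... (1 - x_n^2) - 1 for x in the open cube."""
    prod = 1.0
    for xh in x:
        prod *= 1.0 - xh * xh
    return 2.0 * prod - 1.0

def omega(x):
    """Omega(x) as read above; x is a tuple, and the origin is excluded."""
    if any(abs(xh) >= 1.0 for xh in x):
        return 0.0                       # outside the open cube R
    t = theta(x)
    return math.exp(t / (1.0 - t * t))   # e^{theta'(x)}

for x in (0.1, 0.5, 0.9, 0.99, 1.5):
    print(x, omega((x,)))
```

In one variable the values grow very large toward the origin and fall off faster than any power of the distance toward the boundary points, which is what the later construction of the functions built from it relies on.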

8. The subdivision of $E - A$. Divide E into n-cubes of side 1, and let $K_0$
be the set of all these cubes whose distances from A are at least $6n^{1/2}$ (if
there are any). In general, having constructed the cubes of $K_{s-1}$, divide each
cube which is now present but is not in $K_0 + \cdots + K_{s-1}$ into $2^n$ cubes of
side $1/2^s$, and let $K_s$ be the set of all these cubes whose distances from A
are at least $6n^{1/2}/2^s$ (if there are any).

The distance from any cube C of $K_s$ to A is $< 18n^{1/2}/2^s$ ($s \ge 1$); for it lies
in a cube C' of the previous subdivision which does not belong to $K_{s-1}$, and
whose distance from A is therefore $< 6n^{1/2}/2^{s-1}$.

Any cube C of $K_s$ is separated from any cube C' of $K_{s+2}$ by at least four
cubes of $K_{s+1}$. For the distance from C to A is $\ge 12n^{1/2}/2^{s+1}$, the distance
from any point of C' to A is $< 9n^{1/2}/2^{s+1}$, and the diameter of any cube of
$K_{s+1}$ is $n^{1/2}/2^{s+1}$.
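
The subdivision just described is easy to carry out by machine in the simplest case. The sketch below is an illustration only, under assumptions that are not in the paper: one variable (n = 1, so cubes are intervals), A consisting of the origin alone, and a finite window of the line; the names `subdivide` and `dist_to_origin` are hypothetical. It collects the intervals of each $K_s$ and prints, for each s, how many there are and their greatest distance from A, which can be compared with the bound $18n^{1/2}/2^s$.

```python
from fractions import Fraction

def dist_to_origin(cell):
    """Distance from the closed interval cell = (a, b) to the set A = {0}."""
    a, b = cell
    return Fraction(0) if a <= 0 <= b else min(abs(a), abs(b))

def subdivide(lo, hi, levels):
    """Return {s: list of intervals in K_s} for the window [lo, hi], n = 1."""
    K = {}
    pending = [(Fraction(j), Fraction(j + 1)) for j in range(lo, hi)]
    for s in range(levels + 1):
        if s > 0:
            # split every interval not yet accepted into 2^n = 2 halves
            pending = [half for (a, b) in pending
                       for half in ((a, (a + b) / 2), ((a + b) / 2, b))]
        threshold = Fraction(6, 2 ** s)   # 6 n^(1/2) / 2^s with n = 1
        K[s] = [c for c in pending if dist_to_origin(c) >= threshold]
        pending = [c for c in pending if dist_to_origin(c) < threshold]
    return K

K = subdivide(-8, 8, levels=5)
for s in sorted(K):
    print(s, len(K[s]), max(dist_to_origin(c) for c in K[s]))
```

Exact rational arithmetic is used so that the comparisons with the thresholds are not disturbed by rounding.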

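9. The functions $\phi_\nu(x)$. We introduce the following definitions:

$y^1, y^2, \cdots$ is the set of all vertices of cubes of $K_0 + K_1 + \cdots$, arranged
in a sequence.

$r_\nu$ is the distance from $y^\nu$ to A ($\nu = 1, 2, \cdots$).

$x^\nu$ is a fixed point of A whose distance from $y^\nu$ is $r_\nu$.

$b_\nu$ is the length of side of the largest cube of $K_0 + K_1 + \cdots$ with $y^\nu$ as a
vertex.

$I_\nu$ is the set of points x for which $|x_h - y_h^\nu| \le b_\nu$ ($h = 1, \cdots, n$); $B_\nu$ is its
boundary.

$\pi_\nu(x) = \Omega\left( \dfrac{x_1 - y_1^\nu}{b_\nu}, \cdots, \dfrac{x_n - y_n^\nu}{b_\nu} \right)$ in $E - y^\nu$;

$\phi_\nu(x) = \pi_\nu(x) \big/ \sum_\mu \pi_\mu(x)$ in $E - A$, $x \ne y^1, y^2, \cdots$; $\phi_\nu(x) = 1$ for $x = y^\nu$; $\phi_\nu(x) = 0$ for $x = y^\mu$ ($\mu \ne \nu$).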

Suppose y* is a given point of $E - A$, distant $\delta^*$ from A (or from a given
point $x^0$ of A), and suppose y* lies in the cube C of $K_s$. Then if $I_\nu$, with center
$y^\nu$, has points in common with C, and $y^\nu$ is distant $d_\nu$ from A (or from $x^0$),

(9.1) $\delta^*/2 \le d_\nu < 2\delta^*$.

To prove this, say C' is a largest cube with $y^\nu$ as a vertex, and C' is in $K_t$;
then $t \ge s - 1$. The diameter of C' is $n^{1/2}/2^t$; hence $y^\nu$ is distant at most $n^{1/2}/2^t$
$\le 2n^{1/2}/2^s$ from any point of $I_\nu$. As the diameter of C is $n^{1/2}/2^s$, $y^\nu$ is distant
at most $3n^{1/2}/2^s$ from y*. But $\delta^* \ge 6n^{1/2}/2^s$, and the inequalities follow.

Each function $\pi_\nu(x)$ is $> 0$ in $I_\nu - B_\nu - y^\nu$ and only there; it approaches
$\infty$ and 0 to infinite order as x approaches $y^\nu$ and $B_\nu$ respectively. Each point x
of $E - A$ is interior to some cube $I_\nu$, hence $\pi_\nu(x) > 0$ for some $\nu$, and $\sum \pi_\nu(x) > 0$
in $E - A$, justifying the definition of $\phi_\nu(x)$. Note that $\phi_\nu(x)$ is $> 0$ in $I_\nu - B_\nu$
and only there; also

(9.2) $\sum_\nu \phi_\nu(x) = 1$ in $E - A$.


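We shall show that $\phi_\nu(x)$ is of class $C^\infty$ in $E - A$. This is obvious at points
$x \ne y^1, y^2, \cdots$. Consider a small neighborhood $U_\lambda$ of $y^\lambda$, $\lambda \ne \nu$. $\pi_\lambda'(x) = 1/\pi_\lambda(x)$ is of class $C^\infty$ in
$U_\lambda$ (see §7); hence the same is true of $\phi_\nu = \pi_\lambda' \pi_\nu / (1 + \pi_\lambda' \sum_{\mu \ne \lambda} \pi_\mu)$ in $U_\lambda$. Similarly
$\phi_\nu = 1/(1 + \pi_\nu' \sum_{\mu \ne \nu} \pi_\mu)$ is of class $C^\infty$ in a small neighborhood $U_\nu$ of $y^\nu$; the
statement follows.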

10. The derivatives of the $\phi_\nu(x)$. Consider two (closed) cubes C and C' of
$K_0 + K_1 + \cdots$, and let I and I' be those sets $I_\nu$ with points in C and C'
respectively. We shall say C and C' are of the same type if the sets in I' can be
brought into coincidence with the sets in I by a translation and stretching
of the axes, that is, if the structure of the subdivision about C' is the same
as that about C. There are but a finite number, say d, of possible types of
cubes, and for some number c, there are at most c sets $I_\nu$ with points in any
given cube C.




Take a fixed cube C of $K_0$ and a fixed k. As each $\phi_\nu(x)$ is of class $C^\infty$,
$D_k \phi_\nu(x)$ is bounded in C; there are only a finite number of these functions $\not\equiv 0$
in C, and hence they are uniformly bounded:

$|D_k \phi_\nu(x)| < N_k(C)$ in C ($\nu = 1, 2, \cdots$).

Consider now any cube C' of any $K_s$, and let C be a (perhaps hypothetical)
cube of $K_0$ of the same type as C'. If $I_{\lambda_1}, \cdots, I_{\lambda_t}$ are the sets $I_\nu$ with points
in C', let $I_{\lambda_1'}, \cdots, I_{\lambda_t'}$ be the corresponding sets with points in C; the latter
set of sets is carried into the former by a translation of the axes and a stretching
by a factor $1/2^s$. Each function $\phi_{\lambda_\mu'}$ corresponding to $I_{\lambda_\mu'}$ goes thereby into
the function

$\phi_{\lambda_\mu}(x) = \phi_{\lambda_\mu'}[y^{\lambda_\mu'} + 2^s (x - y^{\lambda_\mu})]$

corresponding to $I_{\lambda_\mu}$. Therefore, differentiating $\sigma_k$ times with respect to x,

$D_k \phi_{\lambda_\mu}(x) = 2^{s\sigma_k} D_k \phi_{\lambda_\mu'}[y^{\lambda_\mu'} + 2^s (x - y^{\lambda_\mu})]$

for x in C', and hence

$|D_k \phi_\nu(x)| < 2^{s\sigma_k} N_k(C)$ in C' ($\nu = 1, 2, \cdots$),

as $\phi_\nu(x) \equiv 0$ in C' for $\nu \ne \lambda_1, \cdots, \lambda_t$. Now the constants $N_k(C)$ take on at
most d distinct values for a fixed k; if we let $N_k$ be the largest of these, we can
state: Given any n-fold set of numbers k, there is a number $N_k$ such that if C is
any cube of $K_s$, then

(10.1) $|D_k \phi_\nu(x)| < 2^{s\sigma_k} N_k$ in C ($\nu = 1, 2, \cdots$).

11. A differentiable extension of f(x), m finite. We are now in a position to
prove, for m finite,

LEMMA 2. Under the conditions of Theorem I, there is a function g(x) of class
$C^\infty$ in $E - A$, having the properties (1) and (2) of Theorem I.

For each $\nu$ ($\nu = 1, 2, \cdots$) there are functions $\phi_\nu(x)$ and $\psi(x; x^\nu) = \psi_0(x; x^\nu)$;
we put

(11.1) $g(x) = \sum_\nu \phi_\nu(x)\, \psi(x; x^\nu)$ in $E - A$, $g(x) = f(x)$ in A.

As the $\phi_\nu(x)$ and $\psi(x; x^\nu)$ are of class $C^\infty$ in $E - A$, the same is true of g(x).
The function $g(x) = f(x)$ is of class $C^m$ at all inner points of A, by §3. It
remains to show that $D_k g(x)$ exists, equals $f_k(x)$, and is continuous, at all
boundary points of A, for $\sigma_k \le m$.




Take a fixed boundary point $x^0$ of A, and any $\epsilon$, $0 < \epsilon < 1$. Take

$\eta < \epsilon / \{2c[(m + 2)!]^n (108 n^{1/2})^m N\}$ and $\eta < \epsilon/6$,

where N is the largest of the numbers $N_k$ for $\sigma_k \le m$. Take $M > |f_k(x)|$
($\sigma_k \le m$, x in A and $r_{xx^0} \le 1$), and take

$\delta < \epsilon / \{6(m + 1)^n M\}$ and $\delta < 1$

so small that (3.2) holds at the point $x^0$ with $\epsilon$ replaced by $\eta$. Take now any
point y* of $E - A$ within a distance $\delta/4$ of $x^0$; we shall show that

(11.2) $|D_k g(y^*) - f_k(x^0)| < \epsilon$ ($\sigma_k \le m$).

Say the distance from y* to A is $\delta^*/4$ (then $\delta^* < \delta$), and let x* be a point of
A distant $\delta^*/4$ from y*. Consider the sum in (6.1) with x' and x replaced by
x* and $x^0$ respectively; as each $l_h$ is $\le m$, it contains at most $(m + 1)^n$ terms.
If we take the term with $l_1 = \cdots = l_n = 0$ to the other side, there is in each
remaining term a factor $(x_h^* - x_h^0)^{l_h}$ with $l_h > 0$. As each $|x_h^* - x_h^0|$ is $< \delta < 1$,
we find

$|\psi_k(x^*; x^0) - f_k(x^0)| < (m + 1)^n M \delta < \epsilon/6$.

But also $|R_k(x^*; x^0)| < \eta < \epsilon/6$; hence, using (6.2),

$|f_k(x^*) - f_k(x^0)| < \epsilon/3$.

Similarly we see that $|\psi_k(y^*; x^*) - f_k(x^*)| < \epsilon/6$; therefore

(11.3) $|\psi_k(y^*; x^*) - f_k(x^0)| < \epsilon/2$ ($\sigma_k \le m$).


Say y* lies in the cube C of $K_s$, and let $I_{\lambda_1}, \cdots, I_{\lambda_t}$ be those sets $I_\nu$ with
points in C. Each corresponding point $y^{\lambda_\mu}$ is distant $< \delta/2$ from $x^0$, by (9.1),
and hence each corresponding point $x^{\lambda_\mu}$ is distant $< \delta$ from $x^0$. As the same is
true of x*, (3.2) gives


(11.4) | Re(x?; x*)| < ++, 
Set 
= x”) — (vy = Au, Ae); 


then as <5 and | | <5« for x in C, | and (6.3) and 
(11.4) give 


(11.5) | | < (m+ in C = 
Using (9.2), we see that 


(11.6) g(x) = + in C. 


a=] 




As Dif (x; x”) =yi(x; x”) and therefore Dif,.0(x) =£,;x(x), 


Dig(x) = x*) + > in C. 


s=1 


(10.1) and (11.5) give, as t<c (see §10) and 


= 
h 


(11.7) Deg(x) — Wa(x; x*)| << Soc[(m + in C. 
l 


Now the distance from C to A is >5«/6; also, as C is in K,, this distance is 
<18n2/2*, Hence or, This gives, as 
and 6*<1, 


(11.8) | Deg(x) — ve(x; x*) | < c[(m + 2)!]"(108n"/2) < €/2 


in C, and in particular, at y*. This inequality together with (11.3) gives
(11.2), as required.

The proof can now be completed with the aid of Lemma 1. (11.2) with
k = 0 shows that g(x) is continuous throughout E. Take any number
$k = (k_1, \cdots, k_n)$ with $\sigma_k < m$, and put $k' = (k_1, \cdots, k_h + 1, \cdots, k_n)$.
Assuming that $D_k g(x)$ is continuous in E, we shall show that $D_{k'} g(x)$ exists and
is continuous in E. Take any boundary point $x^0 = (x_1^0, \cdots, x_n^0)$ and put
$z_0 = x_h^0$, $w_1 = f_{k'}(x^0)$, $w(z) = D_k g(x_1^0, \cdots, z, \cdots, x_n^0)$. Let A* be
the set of points of A for which $x_p = x_p^0$ ($p \ne h$). (3.3) with $x = x^0$ and
$\Delta x_h = x_h - x_h^0$, and (11.2) with k replaced by k', show that the conditions of the
lemma are fulfilled; hence $dw(z_0)/dx_h = D_{k'} g(x^0)$ exists and equals $f_{k'}(x^0)$.
(11.2) shows that $D_{k'} g(x)$ is continuous at $x^0$. Therefore g(x) is of class $C^m$
in E.

12. A differentiable extension of f(x), m infinite. We now prove Lemma 2
for the case $m = \infty$. For any given m, let $\psi_{m;k}(x'; x)$ ($\sigma_k \le m$) be the function
given by the right hand side of (6.1). Choose the axes so that the origin falls
on a point of A. Let $S_p$ be the set of all points of E whose distances from the
origin are $< 2^p$, $p = 1, 2, \cdots$. Let $M_p$ be the maximum of $|f_k(x)|$ for $\sigma_k \le p$
and x in $A \cdot S_p$, and let $N^{(p)}$ be the maximum of $N_k$ for $\sigma_k \le p$. Choose for each
positive integer p a number $\delta_p$ such that


5p < + My. bp < 


The extension g(x) of f(x) is determined as follows. Given any number 
v, determine the number y, so that 6,,,:<7,<6,, (see §9); set y,=0 if r,>61. 




Put 


x’) in E-—A, 
(12.1) = 


(x) in A. 


Given any fixed k, we shall find an inequality similar to (11.2) for 
Dig (x). Let g‘™(x) be the extension of f(x) of class C™ given by Lemma 2 
(m=1, 2,---). Given any boundary point x° of A and any e>0, choose 
p2o.+2 so that x° lies in S, and so that 1/2?<e. Take 5<5, so that (11.2) 
with g replaced by g» will hold for our given k and any y* of E—A within 
5 of x°; we show next that for any such y%*, 


(12.2) | Dig(y*) — Dig*®(y*)| <e. 


Choose g so that 6,4: 5+<6,, where 5+ is the distance from y* to A; then 
Define C, K., as in §11. Note that for y=any Aa, 
Sr,<26<25,<6,1, hence y,+1>-—1, and thus y,>p—220,. Set 


(12.3) = ;0(%; 2") x”) » Az); 
using (12.1) and (11.1), we see that 
(12.4) gx) = gew(x) + in C. 


u=1 
Now, \D,£,(x) 2”) 2”). If we replace by 7 in (6.1), then 


those and only those terms in the sum with ¢,Sm-—<, occur. Replacing m 
by 7, and o; successively and subtracting, we have 


(12.5) D(x) = (x — in C. 
l! 
Now 1,>5+/2, by}(9.1), hence 7, >5,42, and thus y,Sqg+1 - ++, As); 
there are therefore less than (g+2)" terms in the sum, and in each term, 
It follows that | f;4:(x”) |<M 4: in each term. Also | | 
<25«<26, and ¢,20,—0,+1 in each term; hence 


| D§.(x) | < (q + 9412915, 
in C. This with (12.4) gives 


t k 
| Dig(x) — Dig(x)| > 
u=1 7 


k 
j Com) (q + 2)"M g 
i 




in C. Now the distance from C to A is >ds/2 and is <18n/2/2*; hence 
2* <36n"/?/5». Also <q; therefore 
| Deg@(x) — Dag (x) | < c[(q + 2)!]"(36n"/?) M 


<1/2*<e 


in C, and in particular, at y*, proving (12.2). Using (11.2), we find
$|D_k g(y^*) - f_k(x^0)| < 2\epsilon$ for any point y* of $E - A$ within $\delta$ of $x^0$. Again we
can apply Lemma 1 and show that $D_k g(x)$ exists and is continuous throughout
E. As this is true for every k, the proof is complete.

13. We prove next a combined extension and approximation theorem.

LEMMA 3. Let f(x) be of class $C^m$ (m finite) in E, with $D_k f(x) = f_k(x)$ ($\sigma_k \le m$)
there, and let $f_k(x)$ ($m < \sigma_k \le m'$, $m' > m$ finite or infinite) be defined in the
closed set A so that f(x) (considered now only in A) is of class $C^{m'}$ there. Then
for an arbitrary $\epsilon > 0$ there is a function g(x) which is of class $C^m$ in E, of class
$C^{m'}$ in a neighborhood of A, and equals f(x) outside another neighborhood of A,
such that

(13.1) $D_k g(x) = f_k(x)$ in A ($\sigma_k \le m'$),

and

(13.2) $|D_k g(x) - D_k f(x)| < \epsilon$ in E ($\sigma_k \le m$).


Let f'(x) be the extension of class $C^{m'}$ of the values of f(x) in A given by
the last lemma, and put $\zeta(x) = f'(x) - f(x)$; then $\zeta(x)$ is of class $C^m$ in E, and

$D_k \zeta(x) = 0$ in A ($\sigma_k \le m$).

Set n= (N=max for o,Sm). As {(x) is of 

class C™ and D,{(x) vanishes in A (o,<m), we can find an open set R con- 
taining A so that if y is any point of R—A, at a distance 6 from A, then 

| Dis (y) | < (o, Sm). 


Let $\nu_1, \nu_2, \cdots$ be those numbers such that $I_{\nu_p}$ lies wholly in R ($p = 1, 2, \cdots$).
We set

(13.3) $g(x) = f(x) + \zeta(x) \sum_p \phi_{\nu_p}(x)$ in $E - A$,

and $g(x) = f'(x)$ in A. As $\sum \phi_{\nu_p}(x) = 1$ in an open set surrounding A, $g(x) = f'(x)$
there. As $\sum \phi_{\nu_p}(x) = 0$ in $E - R$, $g(x) = f(x)$ there. The statements about the
class of g(x) are true. To show that (13.2) holds, let y be a point of $R - A$,
distant $\delta$ from A; then, defining C, $K_s$, $I_{\lambda_1}, \cdots, I_{\lambda_t}$ as in the previous lemma,
we have




k 
| Dig(y) — Di f(y) | | | | 


14. We close this section with a theorem concerning the isolated points
of A. Define $\alpha_p$ as follows:

(14.1) $\alpha_p = m$ if m is finite, $\alpha_p = p$ if m is infinite ($p = 1, 2, \cdots$).

LEMMA 4. Consider the closed set $A = A' + a_1 + a_2 + \cdots$, where $a_1, a_2, \cdots$
are isolated points (then A' is closed), and let m be finite or infinite. Let $f_k(x)$ be
defined in A' for $\sigma_k \le m$ and at each $a_p$ for all k, so that f(x) is of class $C^m$ in A
in terms of the $f_k(x)$ ($\sigma_k \le m$). Then there is a function g'(x) of class $C^\infty$ in $E - A'$
and of class $C^m$ in E, such that

(14.2) $D_k g'(x) = f_k(x)$ in A' for $\sigma_k \le m$ and at each $a_p$ for all k.

Let g(x) be the extension of f(x) of class $C^m$ given by Lemma 2. Let
$U_1, U_2, \cdots$ be neighborhoods of $a_1, a_2, \cdots$, chosen so that each is at a
positive distance from each other and from A'. If m is finite, we alter g(x)
in $U_1$, next in $U_2$, etc., by means of the last lemma†, so that the new function
g'(x) will take on the required derivatives at $a_1 + a_2 + \cdots$, and so that

(14.3) $|D_k g'(x) - D_k g(x)| < 1/p$ in $U_p$ ($\sigma_k \le \alpha_p$, $p = 1, 2, \cdots$).

(14.2) is an immediate consequence of this inequality and Lemma 1.


III. APPROXIMATION THEOREMS 


15. We prove first the following extension of the Weierstrass approximation
theorem.‡

LEMMA 5. Let g(x) be of class $C^m$ in E (m finite), and let S be a bounded
closed set in E.§ Then for each $\epsilon > 0$ there exists a function G(x) analytic in E
and such that

(15.1) $|D_k G(x) - D_k g(x)| < \epsilon$ in S ($\sigma_k \le m$).

Let $R_b$ be the set of points distant at most b from the origin ($b \ge 0$). Consider
the n-tuple integral


† We use the last lemma with A replaced by $a_p$ and m' by $\infty$.

‡ Compare de la Vallée Poussin, Cours d'Analyse, vol. I, 2d edition, 1912, pp. 126-137.

§ It is sufficient that g(x) be defined over S, for we can then extend its definition over E, by
Lemma 2.




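(15.2) $\Phi(b) = T \int_{R_b} e^{-r_{0y}^2}\, dy = T \int \cdots \int_{R_b} e^{-(y_1^2 + \cdots + y_n^2)}\, dy_1 \cdots dy_n$,

where T is chosen so that $\Phi(\infty) = 1$; then $0 \le \Phi(b) < 1$ for all b. If we replace
y by $\kappa y$ and b by $\kappa b$, we see that

(15.3) $\Phi(\kappa b) = T \kappa^n \int_{R_b} e^{-\kappa^2 r_{0y}^2}\, dy$.

Let v(x) be a function =1 in S, =0 outside some neighborhood of S, and of
class $C^\infty$ in E, such that $D_k v(x) = 0$ in S for all k with $\sigma_k > 0$. (Such a function may be
found for instance by the aid of Lemma 2.) Put $g'(x) = v(x) g(x)$, and

(15.4) $G(x) = T \kappa^n \int_E g'(y)\, e^{-\kappa^2 r_{xy}^2}\, dy$,

where $\kappa$ will be chosen later; G(x) is analytic in E. As $r_{xy}$ is a function of $y - x$
alone, differentiating under the integral sign gives

$D_k G(x) = T \kappa^n \int_E g'(y)\, D_k^{(x)} e^{-\kappa^2 r_{xy}^2}\, dy = (-1)^{\sigma_k} T \kappa^n \int_E g'(y)\, D_k^{(y)} e^{-\kappa^2 r_{xy}^2}\, dy$,

where $D_k^{(x)}$ and $D_k^{(y)}$ denote differentiation with respect to x and y
respectively. Integrating by parts $\sigma_k$ times gives

(15.5) $D_k G(x) = T \kappa^n \int_E D_k g'(y)\, e^{-\kappa^2 r_{xy}^2}\, dy$.

As $\Phi(\infty) = 1$, we see that

(15.6) $D_k G(x) - D_k g'(x) = T \kappa^n \int_E [D_k g'(y) - D_k g'(x)]\, e^{-\kappa^2 r_{xy}^2}\, dy$.

Take M so large that

(15.7) $|D_k g'(x)| < M$ in E ($\sigma_k \le m$).

The functions $D_k g'(x)$ are uniformly continuous in E; hence there is a $\delta > 0$
such that

(15.8) $|D_k g'(y) - D_k g'(x)| < \epsilon/2$ ($r_{xy} < \delta$, $\sigma_k \le m$).

Take $\kappa$ so large that

(15.9) $1 - \Phi(\kappa\delta) < \epsilon/(4M)$.

For a given x, let U consist of all points within $\delta$ of x; then if $J_1$ and $J_2$ are
formed by replacing the domain of integration on the right hand side of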




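(15.6) by U and $E - U$ respectively, we have, using (15.3),

$|J_1| < \dfrac{\epsilon}{2}\, T \kappa^n \int_U e^{-\kappa^2 r_{xy}^2}\, dy = \dfrac{\epsilon}{2}\, \Phi(\kappa\delta) < \dfrac{\epsilon}{2}$,

$|J_2| < 2M\, T \kappa^n \int_{E-U} e^{-\kappa^2 r_{xy}^2}\, dy = 2M[1 - \Phi(\kappa\delta)] < \dfrac{\epsilon}{2}$,

and hence $|D_k G(x) - D_k g'(x)| < \epsilon$ in E ($\sigma_k \le m$), which gives (15.1).
G(x) may of course be replaced by a polynomial if desired.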
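
The mechanism of this proof, averaging against a sharply peaked Gaussian kernel, can be tried numerically. The sketch below is an illustration under assumptions not taken from the paper: one variable, for which the normalizing constant is $1/\sqrt{\pi}$, the test function $g(y) = |y|$, and a plain midpoint sum in place of the integral; the names `smooth` and `sup_error` are hypothetical. It reports the largest error on a grid over $[-1, 1]$ for three values of the sharpness parameter.

```python
import math

def smooth(g, x, kappa, half_width=6.0, steps=2000):
    """Midpoint-sum estimate of (kappa/sqrt(pi)) * integral of
    g(y) exp(-kappa^2 (y - x)^2) dy over |y - x| <= half_width / kappa."""
    a = x - half_width / kappa
    h = 2.0 * half_width / kappa / steps
    total = 0.0
    for i in range(steps):
        y = a + (i + 0.5) * h
        total += g(y) * math.exp(-(kappa * (y - x)) ** 2)
    return kappa / math.sqrt(math.pi) * total * h

def g(y):
    return abs(y)   # continuous, with a corner at the origin

def sup_error(kappa):
    xs = [i / 50.0 - 1.0 for i in range(101)]
    return max(abs(smooth(g, x, kappa) - g(x)) for x in xs)

errors = [sup_error(k) for k in (2.0, 8.0, 32.0)]
print(errors)
```

For this particular g the error is greatest at the corner, where it equals $1/(\kappa\sqrt{\pi})$ up to the quadrature error, so the uniform error falls in proportion to $1/\kappa$; the smoothed function itself is analytic for every fixed $\kappa$.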
16. The above lemma can be generalized as follows. 


LEMMA 6. Let R be an open set and let $R_1, R_2, \cdots$ be bounded open sets
(some of which may be void) whose sum is R, such that each $\bar R_p = R_p$ plus boundary
is in $R_{p+1}$. Then if g(x) is defined and of class $C^m$ (m finite or infinite) in
R, and $\epsilon_1 \ge \epsilon_2 \ge \cdots$ are given positive numbers, there is an analytic function
G(x) defined in R such that

(16.1) $|D_k G(x) - D_k g(x)| < \epsilon_p$ in $R - R_p$ ($\sigma_k \le \alpha_p$, $p = 1, 2, \cdots$).

$\alpha_p$ is defined in (14.1). Note that, if $R_1, \cdots, R_q$ are void, then

(16.2) $|D_k G(x) - D_k g(x)| < \epsilon_q$ in R ($\sigma_k \le \alpha_q$).


Consider the closed set $\bar R_{p-1} + (\bar R_{p+1} - R_p) + (E - R_{p+2}) = Q_p' + Q_p + Q_p''$;
if in Lemma 2 we replace A by this set and f(x) by a function =1 in $Q_p$ and
=0 in $Q_p' + Q_p''$, we find a function $u_p(x)$ for each p, of class $C^\infty$ in E, such
that

(16.3) $u_p(x) = 1$ in $Q_p$, $u_p(x) = 0$ in $Q_p' + Q_p''$; $D_k u_p(x) = 0$ in $Q_p' + Q_p + Q_p''$ ($\sigma_k > 0$).

(If $R_{p+1}$ is void, we put $u_p(x) \equiv 0$; if $R_{p+1}$ is not void but $R_{p-1}$ is void, we have
$u_p(x) = 0$ in $Q_p''$ and =1 in $\bar R_{p+1}$.) Let $Z_p \ge 1$ be such a number that

(16.4) $|D_k u_p(x)| < Z_p$ in E ($\sigma_k \le \alpha_{p+1}$, $p = 1, 2, \cdots$).

We define successively analytic functions $G_1(x), G_2(x), \cdots$, by the
following formula:

(16.5) $G_p(x) = T \kappa_p^n \int_E u_p(y) [g(y) - \{G_1(y) + \cdots + G_{p-1}(y)\}]\, e^{-\kappa_p^2 r_{xy}^2}\, dy$.

(For p = 1, the factor in brackets is simply g(y).) $\kappa_p$ is chosen so that, if we set

(16.6) $H_p(x) = u_p(x) [g(x) - \{G_1(x) + \cdots + G_{p-1}(x)\}]$,

then


(16.7) | DiG,(x) — p(x) | < Bg = + 
in s p41) 


(see Lemma 5); we shall restrict x, further later. Remembering the definition 
of u,(x), we see that (16.7) with (16.6) gives 


(16.8) | Deg(x) — De {Gi(x) + +Gp(x)} | < By < €p/2inQp(oe 


Differentiating H,(x) and using (16.4) and (16.8) with p replaced by p—1, 
we see that (compare the derivation of (11.7)) 


| (x) | < + = €,/2?t) in Qp-1 (ox S ay). 


As $u_p(x)$ and its derivatives are 0 in $R_{p-1}$, this holds in $R_{p-1}$ also; hence, using
(16.7), we have

(16.9) $|D_k G_p(x)| < \epsilon_p/2^p$ in $R_p$ ($\sigma_k \le \alpha_p$).

We set now

(16.10) $G(x) = G_1(x) + G_2(x) + \cdots$;

this is the desired approximation to g(x). To prove this, we see first from
(16.9) that $D_k [G_1(x) + \cdots + G_p(x)]$ converges uniformly in any bounded
closed subset of R ($\sigma_k \le m$); hence G(x) is defined in R, and

(16.11) $D_k G(x) = D_k G_1(x) + D_k G_2(x) + \cdots$ in R ($\sigma_k \le m$).
Next (16.9) shows that 
| + DiGpy2(x) + | < + + 

S €,(1/2? + 1/22 +--+) = €,/2 in Roy: S 


this with (16.8) gives | D.G(x) —Dig(x) | in Say), proving (16.1). 

It remains to be shown that G(x) is analytic in R. To this end we extend
the definition of each $G_p(x)$ to complex values of $x = (x_1' + i x_1'', \cdots,
x_n' + i x_n'')$, using (16.5) still. Consider the analytic function of x

$r_{xy}^2 = \sum_h (y_h - x_h)^2 = \sum_h [(y_h' - x_h') + i(y_h'' - x_h'')]^2$;

as $y_h'' = 0$ in (16.5), the domain of integration being real,

(16.12) $\Re(r_{xy}^2) = \sum_h [(y_h - x_h')^2 - x_h''^2]$.


Take any point x° of R and let U be the complex region of radius p about 2°, 
where p is so small that the real points in the complex region of radius 3p 
about lie in some R,; we take so that 3p? Nowif p>q, xisin U, and 
y isin R—R,_1, then <p? and (ys )?24p?, and hence 


Rr 2?) > 3p?. 




Also H,(y) vanishes in R, and in E—R,42 for p>q; therefore if M, is the 
maximum of | H,(y)| (note that H,(y) is determined before we determine 
ky) and V, is the volume of R, (p=1, 2,---), 


| + ix”)| < Tk? f Mf 
Ry 


(16.13) 


for x in U and p>gq. Hence if we choose x, successively for p=1, 2,---,s0 
that this quantity is <1/2? (and so that (16.7) holds), then the series in 
(16.10), when defined for complex values of x, converges uniformly in a com- 
plex neighborhood of any point of R. Therefore the function G(x) is analytic 
in R, completing the proof. 

17. The numbers x, as chosen above depend not only on the functions 
u,(x) but also on the function g(x). Under certain restrictions, we can take 
them independent of g(x), as follows. 


Lema 7. Let the open sets Ri, Ro,---+ , the numbers &, €,--- , and the 
functions u,(x), ue(x),- be given as in Lemma 6; let Ai(r), A2(r), -- be 
a sequence of positive continuous functions defined for r>0, such that A,(r)—-0 
as r—0 and Ay,:(r) 2A, (1); let a be a point of R, and M a positive number. 
Then there is a sequence of numbers ki, k2,--- , with the following property. 
If g(x) is any function of class C™ defined in R such that | g(a) |<M and 


(17.1) | Deg(x’) — Dug(x)| im Rp S ap, p = 1,2,---), 
and if G(x) is defined in terms of g(x) as in the previous lemma, using the above 
numbers ky, then G(x) is analytic in R and (16.1) holds. 

As the u’s and their derivatives are uniformly continuous in E, there are 
functions ',(x) of the same sort as the A’s above such that 
(17.2) | Dytty(x’) — in E (0% S = 1,2,---). 
The conditions on g(x) imply that for some Mi’, | Dig(x)|<My" in R; 
Say 

| (ok S = 1,2,---). 
Then as #(x) =0 in R—Rs, we have 
| Djus(x’)Dig(x’) — Djmy(x)Dig(x) | S + As(ree’) 


t If ds is the diameter of Rs, then | g(x)|<M+Ai(d;) in Rs. Now take any k’=(hi,+-+ , ka—1, 
kn) and kn)(O<cxSa2). Let x’x” be a line segment parallel to the x,-axis and 
lying wholly in R;; set r= | an” —x,'| . As | Dy'g(x’")—Dy'g(x’)|<Ax(r), the law of the mean gives, for 
some point x* of x’x”, | Di(x*)| <Ai(r)/r. Hence Di(x)| <Ax(r)/r+A2(ds) in Rs 




for ¢, Sa, and o;Sa: and any x and 2’ in E. Hence if we put 
A#(r) = + Mi’ + 
we shall have, on differentiating Hi(x) =u(x)g(x), 
(17.3) | — DeHi(x)| S in E (ox az). 
Also | DiHi(x) | < [(a2+1)!]°Z/ Mi’ in R; and =0 in E—Rs (0, Sa:); thus 


inequalities corresponding to (15.7) and (15.8) hold for Hi(x). Hence if we 
take 6:>0 so that 


Af (r) < By /2 (r < 51), 
and take x; so that 
1 — < BY /{4[(ae + 1)!]"Z/ Mi’ }, 


then if we form G,(x) for any admissible g(x) by means of (16.5), (16.7) will 
hold with p =1; we take x, large enough so that the right hand side of (16.13) 
with p=1 will be <1/2. 

If we differentiate (16.5) with p=1 o;, times (0, m), we derive an equa- 
tion similar to (15.5); forming this for x=x and x=’ and subtracting, we 
find (changing y to y+2’—< in one equation) 


(17.4) DiGi(x’) — DiGi(x) = Te? f [DiHi(y + x’ — x) — 


This with (17.3), (15.3), and the definition of 6() gives 
(17.5) | DiGi(x") — DiGi(x)| A*(r22) in E (ox S a). 


Assume now we have defined functions A,*(r) and have chosen numbers 
Kp so that 


(17.6) | DiG,(x") — DiG,(x)| S in E (ox S 


so that (16.7) holds, and so that the quantity in (16.13) is <1/2, for p<q. 
Then for any admissible g(x), g(x) —{Gi(x)+ - - - +G,u(x)} satisfies the 
same kind of conditions as g(x); hence, just as before, we find a function 
A,*(r) so that an inequality similar to (17.3) holds for D.H,(x) in E (0% Saq41). 
Also H,(x) is bounded properly; hence we can choose «, so that (16.7) holds 
for any admissible g(x) with p replaced by g, and so that (16.13) with p=q is 
<1/2¢. From this we show, as before, that (17.6) holds with p replaced by g. 
We can thus continue finding functions A,*(r) and numbers x, indefinitely. 
We put finally G(x) =G,(x)+G.(x)+ - --, and show, just as in Lemma 6, 
that G(x) has the required properties. This ends the proof. 




IV. ANALYTIC EXTENSIONS 


18. Proof of Theorem I. Let g(x) be the extension of f(x) of class $C^m$
given by Lemma 2. Set $R = E - A$ and define $R_1, R_2, \cdots$, $\alpha_1, \alpha_2, \cdots$, and
numbers $\epsilon_1, \epsilon_2, \cdots$, approaching zero as in §16. Define G(x) in $E - A$ as in
Lemma 6, and set $F(x) = G(x)$ in $E - A$, $F(x) = f(x)$ in A. That F(x) is of
class $C^m$ in E and property (2) holds follows from (16.1) and Lemma 1, just
as in §11; the other facts are obvious.

19. The functions $\omega_{\nu k}(x)$. In the next sections we shall discuss the
analyticity of the extension of f(x) at the isolated points of A. Let R be an open
set, let $a_1, a_2, \cdots$ be points of R having no limit point in R, let $m_1, m_2, \cdots$
be corresponding integers $\ge 0$, and let m be an integer $\ge -1$ or $\infty$. We
assume that if $a_{\nu_1}, a_{\nu_2}, \cdots$ is any sequence of points $a_\nu$ approaching the
boundary of R, then
(19.1) lim inf m,, = m. 
Choose about each a, a neighborhood U, lying, with its boundary, in R, 
so that no two have common points. Define the numbers p(v; k) so that when 
(v; k) runs through the values (1; &), Sm; (2; k), ox etc.; then p(y; 
runs through the values 1, 2, 3, - - - . Let p’(v; &) equal one plus the largest 
of the numbers m,~* - - , m,, p(v; R). 

Take any positive integer s, and consider all neighborhoods U, such that 
p'(vk) =s for some k (0, <m,); let R, be the set of all points of R whose dis- 
tances from these neighborhoods and from the boundary of R are >1/s, and 
whose distances from the origin are <s. Then R, is a bounded open set, R, 
lies in (S=1, 2,---), RitRet+--- =R, and U, lies in R—R, ow 
(o.<m,). By Lemma 2, there are functions w,;(x) of class C® in E, defined 
for v=1, 2,- , such that 


S m,); w(x) = Oin E — U,. 


(19.2) = 


Choose for each v a positive number 6, <1/v so that 8,=8,4:, and 


B,| Duore(x) | < 1/v[(m, + 1)"] in E 


(19.3) 
(ox S m,, 01 S m,,v = 1,2,---). 


Now let f,, be any set of numbers, defined for o, Sm,,v=1, 2, - - - , satis- 
fying the condition 
(19.4) | S Bp Sm,,v = 1,2,---). 
Set 




(19.5) Corky = foky = (ox S m,,v =1,2,---). 
Take any s=p(vk). As w, (x) =w,.(x) =0 in R—U,, and R,=R,o» is in 

R,-~» Which has no points in common with U,, 

(19.6) w/ (x) = Oin R, (s=1,2,---). 


20. The transformation Z. Define functions u(x), ue(x),--- as in 
Lemma 6. Consider any function 


(20.1) g(x) = (x) + (x) +--- (|X| S 1,5 =1,2,---); 
such functions and X’s we shall call admissible. Set 

(20.2) = 

There are, obviously, functions A,(r), A.(r), - - - , so that (17.1) holds for any 
such g(x); hence, by Lemma 7, we can define numbers xj, ke, - - - , so that if 
G(x) is defined in terms of g(x) as in Lemma 6, then G(x) is analytic in R and 
(16.1) holds. Im using Lemma 6, we replace ay by p. 

We note here a certain property of G(x): If g(x) is admissible and 

(20.3) if g (x) = Oin R,, then | DiG(x)| (on S s — 1). 
As u,(x) =0 in R—R, (pSs—2), u,p(x)g(x) =0 in E (pSs—2). Using (16.5), 


we see in succession that G,(x) =0, - - - , G,2(x) =0. This with (16.9) and 
(16.11) gives 


p=s—1 p=s—1 


in (ox as required. 
Given any admissible g(x), let Lg(x) be the corresponding function G(x). 
It follows easily from the definition of G(x) that L is linear: 


(20.4) L[higi(x) + = ALgi(x) + AcLgo(x). 


We show now that for admissible numbers X, 


(20.5) L > Aw, (x) = (x). 


To prove this, take any point x° of R, in the set R,, and any e>0. Take 
q' =q so that 1/2’-*<e. (19.6) and (20.3), for s=q’+1, g’+2, - - - , give, as 
(x) is admissible, 


| > (x) | = | > (x)| < 1/2*-? = 1/20’-? < ¢/2 


em q’+1 s=q’+1 e=q’+1 


in R,,, and in particular, at x°. As 
dD Awl (x) = Oin 
s=q’+1 
and is admissible, 
|L (x°)| < 1/201 < €/2. 
Moreover 


LE (2) = (2) +L (2); 


a=q’+1 


ILE — (x9) | 


<|L ews (x9) | + | (x*) | < 


q’+1 s=q’+1 


which proves (20.5). 
We prove two inequalities. Take any (v; k), (u; 2) (ox Sm,, o1.S5m,); then 


(20.6) | D,Lw,i(a,) | < 
(20.7) | DiLeyi(a,) — < 
The first follows from (16.1) when we note that a, is in R—R,-~», and 
<€x), ANd Sm,<p’(vk) (recall that a, was replaced by # in using 
Lemma 6). We now prove the second. As w,:(x)=0 in R,wn and p’(ul) 
= p(ul) +1, (20.3) gives 
(a) | — < in S p’(ul) — 1). 
Also (16.1) gives 
(b) | — Deour(x)| < in R— Rp (ox S p, P= p'(ul) — 1). 
Say a, is in Rpyi—R,. As a, is not in p’(vk) Sp, and og Sm,Sp' (vk) 
—1<p-—1. If p=p’(ul) —1, (20.7) follows directly from (b). If p<p’(ul) —1, 
then a, is in R,-qwn-1, and o,.Sp—1<p’(ul) —1, and (a) applies. 

21. An infinite system of linear equations. We prove here 

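Lemma 8. Suppose $\eta_s$ and $c_s$ ($s = 1, 2, \dots$), and $\gamma_{st}$ ($s, t = 1, 2, \dots$), are
given, so that $0 < \eta_s \le 1$ ($s = 1, 2, \dots$), $|c_s| \le 1$, and


(21.1) $|\gamma_{st}| < \eta_s / 2^{t+1}$ ($s, t = 1, 2, \dots$).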




1934] EXTENSIONS OF DIFFERENTIABLE FUNCTIONS 


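Then there are numbers $\lambda_s$ ($s = 1, 2, \dots$) such that


(21.2) $\sum_{t=1}^{\infty} (\gamma_{st} + \delta_{st})\,\lambda_t = c_s$ ($s = 1, 2, \dots$)

and

(21.3) $|\lambda_s - c_s| \le \eta_s$ ($s = 1, 2, \dots$).

Using the method of successive approximations, put

(21.4) $\lambda_{1s} = c_s$, $\lambda_{ps} = -\sum_{t=1}^{\infty} \gamma_{st}\,\lambda_{p-1,t}$ ($p = 2, 3, \dots$).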

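It is readily proved by induction that

(21.5) $|\lambda_{ps}| \le \eta_s / 2^{p-1}$ ($p = 2, 3, \dots$).

Hence the series $\lambda_{1s} + \lambda_{2s} + \cdots$ converges to a limit $\lambda_s$ ($s = 1, 2, \dots$), and


$$\sum_{t=1}^{\infty} (\gamma_{st} + \delta_{st})\,\lambda_t = \sum_{p=1}^{\infty} \sum_{t=1}^{\infty} (\gamma_{st} + \delta_{st})\,\lambda_{pt} = \sum_{p=1}^{\infty} (\lambda_{ps} - \lambda_{p+1,s}) = \lambda_{1s} = c_s,$$


$$|\lambda_s - c_s| = \Bigl|\sum_{p=2}^{\infty} \lambda_{ps}\Bigr| \le \sum_{p=2}^{\infty} \eta_s / 2^{p-1} = \eta_s.$$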
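
The method of successive approximations used in this proof can be tried numerically on a finite truncation of the system. The sketch below is only an illustration: the 6 x 6 size, the sample values of eta, c and gamma, and the function name are all assumptions, and the smallness condition imposed on gamma is the one assumed for (21.1), not a quotation of it.

```python
def solve_by_successive_approximation(gamma, c, n_terms=60):
    """Sum the series lambda_s = lambda_{1s} + lambda_{2s} + ... for a finite system."""
    n = len(c)
    term = list(c)    # lambda_{1s} = c_s
    total = list(c)   # running partial sum of the series
    for _ in range(n_terms - 1):
        # lambda_{ps} = - sum_t gamma_{st} * lambda_{p-1,t}
        term = [-sum(gamma[s][t] * term[t] for t in range(n)) for s in range(n)]
        total = [total[s] + term[s] for s in range(n)]
    return total


# Hypothetical data: a 6 x 6 truncation with a small off-identity part gamma.
n = 6
eta = [1.0 / (s + 1) for s in range(n)]       # 0 < eta_s <= 1
c = [0.5 * (-1) ** s for s in range(n)]       # |c_s| <= 1
# Assumed smallness condition: |gamma_st| < eta_s / 2**(t + 1), t counted from 1.
gamma = [[0.9 * eta[s] / 2 ** (t + 2) for t in range(n)] for s in range(n)]

lam = solve_by_successive_approximation(gamma, c)

# Residual of sum_t (gamma_st + delta_st) * lambda_t = c_s.
residual = max(
    abs(lam[s] + sum(gamma[s][t] * lam[t] for t in range(n)) - c[s])
    for s in range(n)
)
# Check of the bound |lambda_s - c_s| <= eta_s.
deviation_ok = all(abs(lam[s] - c[s]) <= eta[s] for s in range(n))
print(residual, deviation_ok)
```

The partial sums form a Neumann series for the inverse of the matrix with entries gamma_st + delta_st, which is why the smallness of gamma is what makes the series converge.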


22. We are now ready to prove 


Lemma 9. Let R, m, $a_\nu$, $m_\nu$ ($\nu = 1, 2, \dots$) be defined as in §19. Then there
are numbers $\beta_\nu > 0$ ($\nu = 1, 2, \dots$) with the following property. Given any set of
numbers $f_{\nu k}$, defined for $\sigma_k \le m_\nu$, $\nu = 1, 2, \dots$, such that (19.4) holds, there exists
a function G(x) analytic in R, such that


(22.1) $D_k G(a_\nu) = f_{\nu k}$ ($\sigma_k \le m_\nu$, $\nu = 1, 2, \dots$),

and such that if we set G(x) = 0 in E − R, then G(x) is of class $C^m$ in E, and

(22.2) $D_k G(x) = 0$ in $E - R$ ($\sigma_k \le m$).


We define the w,:(x) and the 8, as in §19. Now take any f,, satisfying 
(19.4), and define the ¢,,.x) by (19.5). Define the e, and the transformation L 
as in §20. Set 


(22.3) Nowwk) = B, (ox Sm,,v= 1, 2,°°° 
and 
(22.4) Yer = = — = Dy — Dywyi(ay). 


Let u=p(6/) be the larger of the two numbers s = p(vk), ¢=p(ul). Then us- 
ing (20.6) or (20.7) according as u=s or u=t, we find (as B,@p SB,~» SB,) 




| Vet | < s 


Also | c.|=|f,.|<8,<1. Therefore the equations (21.2) have a solution 
A, de, and 


(22.5) + = = Cs = = 


t=1 


By (22.8) below, the \’s are admissible (§20), and we can define the analytic 
function G(x) in R by the equation 


(22.6) G(x) =L (x). 


t=1 


(20.5) and (22.5) give 


(22.7) DiG(ar) = De (ar) = = 
t=1 pl 
(19.6) and (20.3) show that the last sum above is uniformly convergent in any 
R,; hence the termwise differentiation is permissible. 
Set G(x) =0 in E—R; we must show that G(x) is of class C™ in E. (This is 
trivial if m= —1.) First note that, by (19.4) and (21.3), 


(22.8) | | | Cocky | + = | for | + B, 
this with (19.3) gives (replacing v, k and / by yp, / and ) 
| | < 2/ [u(m, + 1)"] in E 


(ox, 01 S = 


(22.9) 


Now take any boundary point x° of R, any integer m’<m, and any e>0. 
Take g2m’ so that e,<¢/2. Take 6>0 so that R, has no points within 6 of 
x°, and so that if y is any number such that U, has points within 6 of x°, then 
m,=m’' and 2/v<é«/2 (see (19.1)). Consider any point y of R within 6 of x®, 
and take any k, o, $m’. Either D,w,:(y) =0 for all y, J, or else for some p, y 
lies in U,, in which case there are at most (m,+1)” such numbers +0, and 
2/u<e/2, and m,2=m’. Thus if we replace x by y in (22.9) and sum over yu 
and /, we find 


| Di (y) | = | | < €/2 (cx Sm’). 


t=1 pl 


As y is in R—R,, replacing a, by g2m’ in (16.1) gives 


| DiL (y) — De (y) | < < (ox Sm’). 


te=1 




This with the last inequality gives 
| DiG(y)| < ein Rifre, <5 (ox Sm’); 


the proof is completed with the aid of Lemma 1. 
23. Functions analytic at the isolated points of A. Lemmas 4, 6 and 9 lead 
directly to the following theorem. 


THEOREM II. Let A be a closed set in E, and let $a_1, a_2, \dots$ be isolated points
of A. Set $A' = A - (a_1 + a_2 + \cdots)$. Let m be an integer $\ge -1$ or $\infty$, and let the
integers $m_\nu \ge 0$, $\nu = 1, 2, \dots$, satisfy (19.1). Let $f_k(x)$ be defined for x in
$A'$ ($\sigma_k \le m$), and for $x = a_\nu$ ($\sigma_k \le m_\nu$), so that f(x) is of class $C^m$ in A. Then there
is a function F(x) of class $C^m$ in E such that

(1) $F(x) = f(x)$ in A,

(2) $D_k F(x) = f_k(x)$ in $A'$ for $\sigma_k \le m$ and at each $a_\nu$ for $\sigma_k \le m_\nu$,

(3) F(x) is analytic in $E - A'$.


We asked that f(x) be of class $C^m$ in A, while $f_k(a_\nu)$ may not be defined
for certain values of $\nu$ and k ($\sigma_k \le m$). We require merely that after setting
$f_k(a_\nu) = 0$ ($\sigma_k > m_\nu$), f(x) shall be of class $C^m$ in A.

A special case of interest is $m = -1$. The $m_\nu$ and the $f_k(a_\nu)$ are then unrestricted.
$A'$ may be void, in which case F(x) is analytic throughout E. $A'$ may
of course contain isolated points.

To prove the theorem, set $R = E - A'$ and determine the open sets $R_p$ and
the numbers $\beta_\nu$ ($\nu = 1, 2, \dots$), as in §19. Let $g'(x)$ be the extension of f(x)
of class $C^m$ in E and of class $C^\infty$ in $E - A'$ given by Lemma 4 (setting $f_k(a_\nu) = 0$
for $\sigma_k > m_\nu$). Let $G'(x)$ be the analytic function in R given by Lemma 6 (with
$\alpha_p$ replaced by p) such that


(23.1) $|D_k G'(x) - D_k g'(x)| < \beta_p$ in $R - R_p$ ($\sigma_k \le p$),
and set $G'(x) = f(x)$ in $A'$. $G'(x)$ is of class $C^m$ in E, and

(23.2) $D_k G'(x) = f_k(x)$ in $A'$ ($\sigma_k \le m$)
(see §18). Set

(23.3) $f_{\nu k} = D_k g'(a_\nu) - D_k G'(a_\nu)$ ($\sigma_k \le m_\nu$, $\nu = 1, 2, \dots$).


As a, lies in and p’(vk) >m,, (23.1) gives | fix | <B(ox Sm). 
Thus the conditions of Lemma 9 are satisfied, and there is a function G(x) 
analytic in R, =0 in A’, of class C™ in E, and such that (22.1) and (22.2) 
hold. Set 


(23.4) $F(x) = G'(x) + G(x)$;




then F(x) is our required function. It is of class $C^m$ in E as the same is true
of $G'(x)$ and G(x); it is analytic in $R = E - A'$ as the same is true of $G'(x)$ and
G(x); it equals f(x) in $A'$ as $G'(x) = f(x)$ and $G(x) = 0$ there. (22.1), (23.3)
and (14.2) show that $D_k F(a_\nu) = D_k g'(a_\nu) = f_k(a_\nu)$ ($\sigma_k \le m_\nu$); (22.2) and (23.2)
show that $D_k F(x) = f_k(x)$ in $A'$, completing the proof.

24. An extension-approximation theorem. We prove here 


THEOREM III.† Let A be closed, and let $A_{-1}, A_0, A_1, \dots$ be closed subsets of
A such that each $A_s$ lies in $A_{s+1}$. Let $a_{s1}, a_{s2}, \dots$ be points of $A_s - A_{s-1}$ which
are isolated points of A, and set $A' = A - \sum a_{s\nu}$. Let $B_{-1}$ be void, and let $B_0$,
$B_1, \dots$ be sets whose sum B lies in $E - A$, such that each $B_s$ lies in $B_{s+1}$, such
that each set $B_s - B_{s-1}$ has limit points in $B - B_{s-1} + A_s$ only, and such that each
set $A + B - B_s$ is closed. Let $f_k(x)$ be defined in each set $\Gamma_s = A + B - (A_{s-1} + B_{s-1})$
for $\sigma_k \le s$ ($s = 0, 1, \dots$) so that $f(x) = f_0(x)$ is of class $C^s$ in $\Gamma_s$ in terms of the
$f_k(x)$ for each s. Let $\epsilon(x)$ be a continuous function, positive in $E - A'$ and zero in
$A'$. Then there is a function F(x) defined in $E - A_{-1}$ such that


(1) F(x) is of class $C^s$ in $E - A_{s-1}$ ($s = 0, 1, \dots$),

(2) $D_k F(x) = f_k(x)$ in $A - A_{s-1}$ ($\sigma_k \le s$, $s = 0, 1, \dots$),

(3) $|D_k F(x) - f_k(x)| < \epsilon(x)$ in $B - B_{s-1}$ ($\sigma_k \le s$, $s = 0, 1, \dots$);
(4) F(x) is analytic in $E - A'$.


Any number of sets $A_s$, $B_s$ may be void; any of the points $a_{s\nu}$ may not
exist. Note that if $A_\infty = A - (A_{-1} + A_0 + \cdots)$, then F(x) is of class $C^\infty$ at all
points of $A_\infty$. Theorem I for m finite is obtained by letting B and $A_{-1}, \dots, A_{m-1}$
be void, and setting $A = A_m$; and for m infinite, by letting B and every $A_s$
be void. Theorem II is obtained similarly; we arrange the $a_{s\nu}$ in a sequence
$a_1, a_2, \dots$, and set $m_\nu = s$ if $a_\nu$ is in $A_s - A_{s-1}$. Lemma 6 is obtained by setting
A=A_,=E-R, (s=0, 1, - ++), and taking ¢(x) so that e(x)S«, 
in R—R,. 

We turn now to the proof. Take a subdivision of the open set E—(A +B) 
as in §8, let y” (v=1, 2, - - - ) be the vertices of the cubes, and let x” be a 
point of I’) whose distance from y” is not more than twice the distance from 
y” to I'o. Define the functions ¢o,(x) in E—(A +B) as in §9, and define gq (x) 
by (11.1), using the functions ¢o,(x) and Yo;0 (x; x”) =f(x”), and replacing 
E-—A and A by E-—(A+B) and I> respectively. Then g¢ (x) is defined 
throughout E—A_,, and is easily seen to be a continuous extension of f(z). 
Let go(x) be a function =g¢ (x) in A—A_,+T; and analytic in the open set 
E-—(A-+TI)) so that 


† A special case of this theorem has been proved by A. Besikowitsch, Über analytische Funktionen
mit vorgeschriebenen Werten ihrer Ableitungen, Mathematische Zeitschrift, vol. 21 (1924), pp. 111-118. 




(24.1) | go(x) — gd (x)| < 0:(x)/4 in E — (A +1)), 


where 6,(x) =min [e(x), distance from x to A+I,] (p=1, 2,---). Then 
£o(x) is continuous in E—A_,. 

We shall now define in succession functions g.(x), ge(x), - - - , with the 
following properties: 

(a) gp(x) is defined in E—A_i, is of class Ct in E—A,_; (s=0,---, 9), 
and is analytic in E—(A+T p41). 

(b) Digp(x) =fi(x) in pu: (ox S5, 5=0,---, p). 

(c) | Digy(x) — Dig p(x) | <e(x)/2°+? in Be (o.Ss, s=0, 
p—1). 

(d) | Digy(x) —fi(x)| <e(x)/2+? in (04S). 


Assuming go(x), - - - , gp-1(”) are defined, we shall define g,(x). Consider 
any point of I’,; it is at a positive distance from the closed set Ap1, and hence 
we can enclose it in an open set lying at a positive distance from A,1. We 
thus enclose I’, in an open set ry containing no points of A,-1, and having no 
limit points in A,_; other than limit points of I',. Take a subdivision of the 
open set E—(A+T,), let y” (v=1, 2, - - - ) be the vertices of the cubes, and 
let x”” be a point of I’, whose distance from y” is not more than twice the 
distance from y” to I, (v=1, 2,---). Define the functions ¢,,(x) in 
E—(A+T,) as in §9 and define x) by (6.1) (6. <p), replacing m by p. 
Remembering that f(x) is of class C? in T,, set 


(24.2) ge (x) = —T,, 


and set gy (x) =g,-:(x) in I’,. From the proof in §11 it is seen that g, (x) is an 
extension of class C? of the values of f(x) in T,. 
Set =g/ (x) —gp_s(x) in ; then £,(x) is of class C?-! in TJ, and 


(24.3) Dit = OinT, (on Sp —1). 


Set (p=0, 1,---), where N®= 
max NV, for o,<p. Let K,_: be the set of points of B,_, for which 


| Dig p(x) | for some p — 1), 


where 56, is the distance from x to I’,, or 1 if that is smaller. Each point of 
A—A,1 is at a positive distance from K,1, as Bp, has no limit points in 
A—A,1, and each point of B—B,_, is at a positive distance from Ky: on 
account of (24.3), as «(x) >0 in B; hence each point of I, is at a positive 
distance from K,1, and we can enclose I’, in an open set IZ’ which lies in 
Ty and contains no points of K,1. Now 




(24.4) | p(x) | < in Tf’ (ox p—1). 
We can also take Ty’ so that if p,(x) is the distance from x to A,_1, then 
(24.5) | Dit p(x) | < in TY’ (o. p—1). 


Let J,, be those sets J, of the subdivision of EK—(A+TI,) lying wholly in 
Ty’ (¢=1, 2,---), and set 


(24.6) 8’ (x) = gp-1(x) + in E— (A 


and gy’ (x) =g,-1(x) =f(x) in A—A_14+T,. Then g/’(x) is of class in 
E-—A,., and with the help of (24.4) we find 


(24.7) | Deg (x) — Degp—s(x) | < €(x)/2°* in By; (ox S p — 1) 
(see §13). Also (24.5) gives 
(24.8) | Digs’ (x) — Dagy-s(x)| < pp(x)/27% in E-A S p— 1), 


and hence g,’ (x) is of class Cin E—A,_; (s=0, - - - , p), as the same is true 
of (s=0, p-1). 
Finally let g,(x) be an analytic function such that 


(24.9) | Dagy(x) — Degp (x) | < Op41(x)/2°* in E — (A+T p41) S 0); 


set gp(x) =gpa(x) =f(x) in A—A_1+T' p41. Then g,(x) has all the required 
properties. (c) is a direct consequence of the above inequality and (24.7); 
(d) follows from (24.9) and the fact that Dig,’ (x) =f.(x) in <p); (a) 
and (b) follow with the aid of Lemma 1. 

Set 


(24.10) $g(x) = \lim_{p \to \infty} g_p(x)$ in $E - A_{-1}$.


By (24.8) and (24.9), g(x) exists and is of class C* in E—A. Let x be any 
point of any A,—A,_1; by (a), g,(x) is of class C* in a neighborhood of x for 
pz2s, and by (24.8) and (24.9), the same is true of g(x). The same argument, 
using (b), shows that 


(24.11) $D_k g(x) = f_k(x)$ in $A - A_{s-1}$ ($\sigma_k \le s$, $s = 0, 1, \dots$).
Finally (c), (d), (24.1) and the definition of $g_0'(x)$ show that
(24.12) $|D_k g(x) - f_k(x)| < \epsilon(x)/2$ in $B - B_{s-1}$ ($\sigma_k \le s$, $s = 0, 1, \dots$).


We have now found an extension g(x) with all the properties but (4). It 
is replaced by an analytic extension F(x) just as in §23; we must be careful 
merely to make 




(24.13) $|D_k F(x) - D_k g(x)| < \epsilon(x)/2$ in $B - B_{s-1}$ ($\sigma_k \le s$, $s = 0, 1, \dots$).


Let a1, a2, - - - be the a, arranged in a sequence, and set m,=s if a, is in 
A,—A,_1. Set R=E-—A’. Let R, consist of those points of the R, of §19 
whose distances from the closed set A+B—B,_; are >1/p (p=1, 2,---). 
Every point of E—A’ lies in some R,, as B—(By)+B,+ - - - ) is void. Take 
the 8, (§19) small enough so that if |\,ox|<28, (see (22.8)) and g*(x) 
then 


(24.14) | Deg*(x)| < e(x)/8 in Roy — Rp (ox p). 


Let ¢/ be one eighth the lower bound of e(x) for x in Ruy: (s=1, 2, +--+), or 
1 if that is smaller. Replace the ¢, of (20.2) by min (¢,, ¢/). Then for any 
such g*(x), (16.1) gives 


(24.15) | DiLg*(x)| < ep +| Dug*(x)| < e(x)/4 in — Ry (ox p). 
Replace the g’(x) of §23 by the present g(x), and determine G’(x) so that 
(24.16) | DiG’(x) — Dig(x)| < min e(x)/4] in R— Ry (ox P), 


and, in particular, in B—B,_,. Now if G(x) and F(x) are determined as in 
§23, then (24.13) and hence property (3) hold; the other properties are easily 
verified, and the proof of the theorem is complete. 


UNIVERSITY, 
CAMBRIDGE, Mass. 


THE ASYMPTOTIC SOLUTIONS OF CERTAIN LINEAR 
ORDINARY DIFFERENTIAL EQUATIONS OF 
THE SECOND ORDER* 


BY 
RUDOLPH E. LANGER 


1. Introduction. The ordinary differential equation


(1) $\dfrac{d^2 v}{ds^2} + \lambda\, p_1(s, \lambda)\,\dfrac{dv}{ds} + \lambda^2 p_2(s, \lambda)\, v = 0,$


in which the coefficients $p_j(s, \lambda)$ are expansible in descending powers of $\lambda$
when $|\lambda|$ is large, includes as special cases many differential equations of
classical importance. It is accordingly the subject of an extensive literature.
In particular, the problem of the asymptotic dependence of its solutions upon
the complex parameter $\lambda$ has been the subject of many investigations, in
which for suitably restricted configurations a theory of considerable generality
has been deduced.
A familiar change of variable gives to the equation the form


(2) $\dfrac{d^2 u}{ds^2} - \{\lambda^2 q_0(s) + \lambda\, q_1(s) + q_2(s, \lambda)\}\, u = 0,$

in which $q_2(s, \lambda)$ is bounded for large values of $\lambda$. The salient hypothesis,
then, which has most generally been assumed and under which a developed
theory is known, is that the variable s be real, and that on the basic interval
considered, the characteristic equation


$\theta^2 - q_0(s) = 0$


have roots $\theta_j(s)$ which are everywhere distinct.† In recent papers the author
has studied the asymptotic forms for a type of equation not subject to this
hypothesis by virtue of the fact that the coefficient $q_0(s)$ becomes zero at
some point of the domain of the variable considered. The form of the solutions
was determined, moreover, for the entire complex plane.‡ With such a
configuration of variables, the feature of prime interest was found to lie in


* Presented to the Society, December 27, 1933; received by the editors July 21, 1933. 

† In this connection special mention is due to Birkhoff and Tamarkin. For references cf. Langer,
R. E., On the asymptotic solutions of differential equations, etc., these Transactions, vol. 34 (1932),
pp. 447-480.

‡ Cf. the previous reference.




ASYMPTOTIC SOLUTIONS OF DIFFERENTIAL EQUATIONS 91 


the incidence of the Stokes' phenomenon, under which the analytic forms
used for the asymptotic representation of any specific solution must be
changed in a discontinuous manner as the point $(s, \lambda)$ varies across certain
specifiable boundaries in the complex $(s, \lambda)$ space. This is attributable to the
fact that the asymptotic representation utilizes multiple-valued expressions
for the description of the single-valued solutions of the equation.

Referring to the equation in form (2) the investigation cited was made for
the case in which the coefficient $q_0(s)$ vanishes at some point like any real
positive power of the variable, but under the assumption that the term in
$\lambda$ does not occur, i.e., $q_1(s) \equiv 0$. In the present paper the case in which the term
in $\lambda$ is present while at some point the term in $\lambda^2$ vanishes to the second order
is to be studied.* It is planned as a sequel to apply the results of this discussion
to a study of the functional form of the solutions of the Mathieu equation
over the complex plane, in a manner resembling that in which the earlier results
were applied to a study of the Bessel functions.

It may be of interest to note that the case to be considered bears a certain 
formal limiting relationship to a class of cases which by suitable transfor- 
mations may be brought under the theory developed in the earlier papers 
cited. Thus, consider the equation 


2 


d*u 
— — {nqo(s) + + gas, = 0, 


ds? 


go(s) = s” ap ¥ 0, 
i=0 
and let the variable be changed by the substitution s=~*z, with o an un- 
determined positive constant. The equation takes the form 
d*u 
and if r<4/2+-» it is always possible to choose ¢ so as to make the second 
highest power of \ which occurs the zeroth power. Then if —p? is written for 
the highest power of \, the equation takes the form 
d*4 
— + {pao + x(z, p)}u = 0. 
dz? 
It is seen that when v= 2, as in the equation (2), the transformation is possi- 


* In this connection cf. for the case of a real variable Goldstein, S., A note on certain approximate
solutions of linear differential equations of the second order (2), Proceedings of the London
Mathematical Society, (2), vol. 33 (1932), p. 246.


with 


92 R. E. LANGER © [January 


ble whenever r<1. For this class of equations the equation (2), in which 
r=1, may evidently be regarded as a limiting form. 
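
The bound on r quoted in this passage can be checked by tracking the powers of the parameter under the substitution. The sketch below rests on assumptions that are not spelled out in the text as it stands: that the equation before substitution has the terms lambda^2 q0(s), lambda^r q1(s) and q2, that q0(s) behaves like a0 s^nu near s = 0, that q1(0) is not zero, and that sigma is chosen as r/2 so that the middle term carries the zeroth power. The function names are hypothetical.

```python
from fractions import Fraction


def lambda_exponents(r, nu):
    """Exponents of lambda after s = lambda**(-sigma) * z with sigma = r / 2.

    Since d^2/ds^2 = lambda**(2 * sigma) * d^2/dz^2, every coefficient is
    multiplied by lambda**(-2 * sigma) when the equation is rewritten in z.
    """
    sigma = Fraction(r) / 2
    leading = 2 - sigma * (2 + nu)      # from lambda**2 * q0(s), q0(s) ~ a0 * s**nu
    middle = Fraction(r) - 2 * sigma    # from lambda**r * q1(s), q1(0) assumed nonzero
    return leading, middle


def transformation_possible(r, nu):
    """True when the middle power is zero and the leading power stays positive."""
    leading, middle = lambda_exponents(r, nu)
    return middle == 0 and leading > 0


# For nu = 2 the condition reduces to r < 1; r = 1 is the limiting case.
print(transformation_possible(Fraction(1, 2), 2))   # inside the range
print(transformation_possible(Fraction(1), 2))      # limiting case, excluded
```

Under these assumptions the leading exponent 2 - (r/2)(2 + nu) is positive exactly when r < 4/(2 + nu), which for nu = 2 gives r < 1, in agreement with the statement that r = 1 is a limiting form.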

2. The normal equation. In the differential equation (2) let the zero of the 
coefficient go(s) be designated by so, and with c as a tentatively undeter- 
mined constant let the change of variable 


$= 3-%-— 


be made. With the use of the formal relations 


go(s) = qo(s + 50) + (s + + 


Bec 
gi(s) = gi(z + So) + + so+ =<), 


the differential equation may then be written 
(3) $u''(z) - \{\lambda^2 \chi_0^2(z) + \lambda\, \chi_1(z) + \chi_2(z, \lambda)\}\, u = 0,$
in which, specifically, 


$\chi_0^2(z) \equiv q_0(z + s_0),$
$\chi_1(z) \equiv q_1(z + s_0) + c\, q_0'(z + s_0).$


Whatever choice of the constant c is made, the zero of the coefficient
$\chi_0^2(z)$ clearly occurs at the origin. It is readily found that c may always be
so chosen that the relation


(4) 3x¢ (0)xi (0) — 2x0’ (0)x:(0) = 0 


is satisfied, and this choice will be made inasmuch as it is convenient for sub- 
sequent purposes. The equation (3) will then (i.e., when (4) is satisfied) be 
designated as in normal form, and the preliminary normalization of the equa- 
tion will be assumed in the discussion which is to ensue. 

The more specific description of the equation for which a theory is to be
deduced may be given, as follows, through the means of an enunciation of
the hypotheses to be made. The variable z is to be complex, and is to vary
over a simply connected (finite or infinite) fundamental region $R_z$, which includes
the origin and in which the hypotheses, to be enumerated from (i) to
(v) below, are simultaneously fulfilled. It may be considered as a blanket
hypothesis upon the equation that some such region exists. The explicit
assumptions are the following:




(i) Within the region R, the coefficient x?(z) is analytic and has a zero of 
the second order at the origin. Moreover, except in the immediate neighborhood 
of the origin the functions 


xls) and 
0 


are bounded from zero. 
(ii) Within the region R, the coefficient x:(z) is analytic, and is such that the 
functions 


x1(2) f { xi(z)/xo(z) } dz 


» and 
(z) f 


are bounded except possibly in the immediate neighborhood of the origin. 

(iii) The coefficient $\chi_2(z, \lambda)$ is analytic in $R_z$, and in any finite portion of $R_z$
is bounded uniformly when $|\lambda|$ is sufficiently large.

The enunciation of hypotheses (iv) and (v), which are less transparent, 


will be deferred to §7. 
For definiteness the function arg xo(z) will be determined by the relation 


’ with a ~ 0, 


arg { lim 2 = 0. 


This involves no loss of generality since the adjustment may always be
achieved by the transfer of a suitable constant factor from the parameter $\lambda^2$
to the coefficient $\chi_0^2(z)$.
3. The related equation. Let the relations 
x1(0) 
(0) 
xi(2) 2kxo(z) 


xo(2) f (Se) = d&(z), 


a(s) 
(5c) = 2xo(z) + (5f) V(z) o'/2(z) 


serve respectively to define the symbols which occur upon the left. The func- 
tion »(z), which is analytic in R, because of the hypotheses (i) and (ii), is 


(Sa) (5d) 5(s) = f $(s)ds, 


(Sb) n(z) = 




found to vanish at the origin in virtue of the normalization (4). It follows 
from this and the hypotheses that outside of any neighborhood of the origin 
the functions ¢(z) and ®(z) are bounded from zero uniformly when |\| is 
sufficiently large, while at the origin they have zeros of the first and second 
orders respectively. The function (z) is accordingly analytic (with proper 
definition at z=0) and uniformly bounded in any finite portion of R,. Lastly, 
it may be shown that since the zero of ¢(z) is a simple one, the expression 


—3/¢\? 3/%¢\? 
2¢ 4\¢ 
is analytic (with proper definition at z=0). 


Let $M_{k,m}(\xi)$ denote the confluent hypergeometric function customarily so
designated,* and let the functions $y_1(z)$, $y_2(z)$ be defined by the formulas


(6) yi(z) = Mi j=1,2.f 
The functions M satisfy the equation

$\dfrac{d^2 M}{d\xi^2} + \Bigl\{ -\dfrac{1}{4} + \dfrac{k}{\xi} + \dfrac{1 - 4m^2}{4\xi^2} \Bigr\} M = 0,$

from which it may be found by direct substitution that the functions (6) are
solutions of the differential equation


(7) $y''(z) - \{\lambda^2 \chi_0^2(z) + \lambda\, \chi_1(z) + \Omega(z, \lambda)\}\, y(z) = 0,$
in which 


at) = 0, 


2kx0? f ndz 
2 k(4x0n + 


= + w($). 


® 
of xodz 
0 


This equation, (7), will be referred to briefly as the related equation. The
similarity of its structure to that of the given equation (3), which is evident
in so far as the coefficients of $\lambda$ and $\lambda^2$ are concerned, extends also to the
remaining element. For with the analyticity of the expression $\omega(\phi)$ established,
it is easily seen that the coefficient $\Omega(z, \lambda)$, as well as $\chi_2(z, \lambda)$, satisfies the
hypothesis (iii).
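
The differential equation satisfied by the functions M, quoted earlier in this section from Whittaker and Watson, can be checked numerically. The sketch below builds M from Kummer's power series and tests the equation by central differences at a single real point. The values k = 0.3, the indices m = 1/4 and m = -1/4, the point 1.5 and the step size are sample choices made for illustration, and the helper names are hypothetical.

```python
import math


def hyp1f1(a, b, x, n_terms=80):
    """Kummer's series 1F1(a; b; x) = sum_n (a)_n / (b)_n * x**n / n!."""
    term, total = 1.0, 1.0
    for n in range(n_terms):
        term *= (a + n) / (b + n) * x / (n + 1)
        total += term
    return total


def whittaker_m(k, m, x):
    """M_{k,m}(x) = x**(m + 1/2) * exp(-x/2) * 1F1(1/2 + m - k; 1 + 2m; x), real x > 0."""
    return x ** (m + 0.5) * math.exp(-x / 2) * hyp1f1(0.5 + m - k, 1 + 2 * m, x)


def whittaker_residual(k, m, x, h=1e-3):
    """Central-difference estimate of M'' + (-1/4 + k/x + (1 - 4 m^2)/(4 x^2)) M at x."""
    f = lambda t: whittaker_m(k, m, t)
    second = (f(x + h) - 2 * f(x) + f(x - h)) / h ** 2
    return second + (-0.25 + k / x + (1 - 4 * m * m) / (4 * x * x)) * f(x)


# Sample values only: k is arbitrary, and the two indices m are illustrative.
residuals = [abs(whittaker_residual(0.3, m, 1.5)) for m in (0.25, -0.25)]
print(residuals)
```

The residuals are limited by the finite-difference step rather than by the series, so they are small but not at machine precision.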

4. The solutions of the related equation. The solutions (6) of the related 
equation take on and are determined by the initial values 

* The formulas and notations of this and the following sections are taken from Whittaker and 
Watson, A Course in Modern Analysis, 3d edition, Cambridge, University Press, 1920, chapter XVI. 


† It is to be consistently understood that when the index j is used in conjunction with double
signs, then j = 1 is to be associated with the upper signs and j = 2 with the lower ones.




(0) = 0, 0) = ¥(0), 
(8) 9:(0) y2(0) (0) 


{(0) = » (0) = ¥'(0). 
(0) yz (0) (0) 
For values of the variable $\xi$ which are numerically small they are conveniently
described by the formulas


3 — 4k 


1 — 4k 
For large values of $\xi$ they are describable by asymptotic formulas. Due to the
incidence of the Stokes' phenomenon, however, such description is dependent
upon the location of the variable $\xi$.

The origin, z = 0, is an ordinary point for both the given and the related
equations, and the region $R_z$ accordingly consists of a single sheet. The relation
(5e), however, maps the z plane upon a Riemann surface having a simple
branch point at $\xi = 0$, and hence there corresponds to $R_z$ a two-sheeted region
$R_\xi$ which is the domain of variation for the variable $\xi$. In this domain let the
sectors $\Xi^{(h)}$ be defined by the relations


1 3a 


h= —2,—-1,0,1, 


(9) 


$\epsilon$ denoting an arbitrarily small but positive and fixed constant. It is clear that
these sectors overlap considerably and also that they completely cover the
domain $R_\xi$. The corresponding sub-regions of $R_z$ will be designated by the
same symbols. It is to be noted that they depend upon the parameter $\lambda$.

When £ lies in the sector = and | £| is sufficiently large, it is known 
that 


1 1 
tettil4 


We 



with 




iter E(é)]* 
Wee = (— 1 + et 


It may be noted that the formula as written is formally correct even in the
exceptional cases when the index k is a quarter of a real odd integer. The
term in which the gamma function is infinite is then merely to be omitted.

When £é does not lie in = the appropriate formulas may be deduced by 
the use of those above in conjunction with the relations 


From these facts it may be computed that when | £|>WN,{ and £ lies in the 
sector =), then 


with coefficients given by the formulas 


(10) 


= 41, for hk = — 1, or 0, 
F for h = 1; 
(11) 
rg t+ pc” kh = — 2, or — 1, 
for hk = 0, or 1. 


oer, for h= — 2, 


It is to be understood that if k is such that one of the gamma functions is
infinite, then the coefficient C which multiplies it in (11) is to be assigned the
value zero. The formulas (10) differ for different values of h. It is readily
observed, however, that in any region common to two of the sectors $\Xi^{(h)}$ their
difference is asymptotically negligible. When it is a matter of choice as to
which sector is to be considered as containing the point $\xi$, then the choice is
always immaterial.

The formulas 


which are found to have the inverse form 


* The symbol E will be used consistently to designate some function which is bounded. There
is to be no implication that the symbol denotes the same function in different instances.

† The formula given for $M_{k,m}$ by Whittaker and Watson, p. 346, appears to be in error.

‡ The symbolism $|\xi| > N$ is to be interpreted merely as an abbreviation of the statement "when
$|\xi|$ is sufficiently large."


1 E 
2 
+ 




(12b) yz) = + als) + 


define for each index / a set of solutions alternative to those described above. 
When £ lies in the sector Z‘™ they are found to be of the forms 


with 


(hum) (m) (A) (m) 
Aje — Ci,3-Co2 


These formulas reduce when m=h to give 


E(é) 


(13a) 


1/2 


W(z) 


and in the simplicity of these forms lies the advantage of the solutions in 
question. 

From the formulas (13a), and the fact that the Wronskian of any pair 
of solutions is independent of z, it is found directly that 


= 
5. Formal developments. Let the function $\theta(z)$ be defined by the relation

$\theta(z) = \chi_2(z, \lambda) - \Omega(z, \lambda).$

Then the given equation may be written in the form

$u''(z) - \{\lambda^2 \chi_0^2(z) + \lambda\, \chi_1(z) + \Omega(z, \lambda)\}\, u(z) = \theta(z)\, u(z),$


and is accordingly solved by any function u(z) which satisfies the relation


1 2 


In this y(z) may be any solution of the related equation and the integration 
may be extended over any desired path which extends from any fixed point * 
to the variable point z. To each choice of these elements there evidently
corresponds a solution u. The correspondence between a solution u and the
associated function y will be indicated by the use of the same subscripts upon
the former as may be attached to the latter.
From the relations (5e) and (5f) it follows that 
dz = ==. dé, 

and hence if for any values of the arguments involved the functions Q are 
defined by the formulas 


(16) Qi(a, B, 6) = + {xn — 5}, 
the relation (15) may be written 


1 
(17) = 960) f 
It should be observed that differentiation of this formula leads simply 
to the associated one 


1 z 
When the form of u(z) has been derived, this formula serves readily to yield 
the form of u’(z). 

For the case in which the formula (17) is to be applied to the particular 
solutions w,,;(z) associated with the related solutions ¥,,;(z) it is convenient 
to define the symbols U and Y by the formulas 

(19) Une) = 4 (8) yh s(2)s 
in which y(z)=y(z), and y‘”(z) for »>0 is to be subsequently defined. 
Then formula (17) may be given in either one of the forms 

(20) (0) 

Un, = + 8), 
in which the final terms may each again be given by alternative formulas 
thus: 


ron Qily, y, 1)m we 


(21) I(u, 


d 
Avs & 




and 


(22) J(U, «, &) 


The process of iteration, familiar from the theory of integral equations, 
may be applied to the equations (17) or (20). It leads formally to the rela- 
tions 


(23) = 92) + U(s) = YR) + HVE), 


with terms given by the recurrence formulas 


= +, 8), 


It is to be shown in the following that the series in (23) which have been 
obtained formally are convergent under appropriate circumstances, so that 
actual solutions are represented by them. 
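
The process of iteration can be illustrated on a toy problem that has the same structure but is not the equation of this paper. In the sketch below the related equation is taken to be y'' - y = 0 and the given equation u'' - u = theta(z) u, so that variation of parameters gives u(z) = y(z) + integral from 0 to z of sinh(z - t) theta(t) u(t) dt. The kernel, the constant theta = 0.5, the grid size, the trapezoid rule and the function name are all assumptions made for the illustration.

```python
import math


def iterate_integral_equation(theta, y, z_max=1.0, n_grid=200, n_iter=25):
    """Iterate u <- y + integral_0^z sinh(z - t) * theta(t) * u(t) dt on a uniform grid."""
    h = z_max / n_grid
    zs = [i * h for i in range(n_grid + 1)]
    y_vals = [y(z) for z in zs]
    u = list(y_vals)  # zeroth approximation: u = y
    for _ in range(n_iter):
        new_u = [y_vals[0]]  # the integral vanishes at z = 0
        for i in range(1, n_grid + 1):
            vals = [math.sinh(zs[i] - zs[j]) * theta(zs[j]) * u[j] for j in range(i + 1)]
            # Trapezoid rule over the grid points 0..i.
            integral = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
            new_u.append(y_vals[i] + integral)
        u = new_u
    return zs, u


# Toy data: constant theta, and y = cosh, so that u(0) = 1 and u'(0) = 0.
eps = 0.5
zs, u = iterate_integral_equation(lambda t: eps, math.cosh)

# With constant theta the limit solves u'' = (1 + eps) u, i.e. u = cosh(sqrt(1 + eps) z).
exact = math.cosh(math.sqrt(1.0 + eps) * zs[-1])
error = abs(u[-1] - exact)
print(error)
```

Each pass adds one more term of the formal series, and on a bounded interval the terms decay factorially, which is the mechanism behind the convergence asserted for the series in (23).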

6. The solutions $u_j(z)$ when $|\xi| \le N$. With any choice of the constant N
the region $|\xi| \le N$ lies entirely within the domain $R_\xi$, provided $|\lambda|$ is sufficiently
large. Hence the straight line from the origin to any point of the region
lies within $R_\xi$, and may be chosen as the path of integration in the formulas
(17) and (18). With this choice and with the rôle of y(z) taken by $y_j(z)$,
either one of the functions (6), the corresponding solution $u_j(z)$ takes at $\xi = 0$,
i.e., at z = 0, the same values as $y_j$. Hence, from (8),


u(0) = 0, u(0) = ¥(0), 


1/2 
uz (0) (0) 
With £ bounded the variables z and £; are likewise bounded. The bound- 
edness of the functions +, ;(z) and hence of the expression Q,(y, y, 1) follows 
directly, and accordingly formulas (24) and (21) yield the relation 


dé, t 
ard. |» 


(24) 


(25) 


Since y,(z) is bounded it follows directly by induction that for any index n 


† It is to be understood that M is used merely to denote some sufficiently large constant.


| 




{iar}: 


The series in the formula (23) accordingly converges for sufficiently large 
values of ||, and is in fact of the order O(||-). 

The conclusion thus deduced, together with that which immediately fol- 
lows from (18) since y,,,(z)A~!/*, and hence Q,(y’, y, 1)A~"/?, are bounded, is 
the following, i.e., that 

E(z, d) 


uj(z) = yj(z) + when < N. 


7. Additional hypotheses. When £ is not restricted to be numerically 
bounded the considerations are less simple than those of §6. In particular 
some stipulations restricting the configuration of the domain of the variables 
must be made, and will be framed in the following way. 

A sub-region of $R_\xi$ will be styled as a region of the type $\tau$ if it is simply
connected (finite or infinite), and fulfills the following specifications:


(a) that it contains the origin and lies entirely within some one of the half
planes bounded by the axis $\Re(\xi) = 0$;

(b) that it contains no more than one segment of any line on which $\Re(\xi)$ is
constant.


Concerning a region of this type, 7, it will be observed that it invariably 
contains upon its boundary (possibly at infinity if the region is infinite) a 
specific point £,, so located that there passes through each point £ of the 
region an ordinary curve, I’, which joins the origin with the point £,, and 
upon which the abscissa, #(£), varies monotonically with the arc length (in 
the sense of non-decreasing or non-increasing). The symbol I will be reserved 
for the designation of arcs of curves of the type described. 

As a hypothesis upon the fundamental domain $R_z$, and upon the range
of the parameter $\lambda$, it will be assumed that


(iv) The region $R_z$ is such that for each admissible value of $\lambda$ every point of
the corresponding region $R_\xi$ lies in some sub-region of the type $\tau$.


If the domain R, is finite no additional assumptions respecting the given 
equation need be made. However, if the domain is infinite the discussion to 
follow necessitates the further and final hypothesis, i.e., that 


(26) 




(v) For all arcs upon which |z|=A (A an arbitrarily large but fixed con- 
stant) and which for some admitted value of \ correspond to a curve of the type T, 


a relation 
<™ 
$(z) 


8. The solutions m,,;(z). Let any region of the type r be considered and let 
the index h be determined by a (any) sector =“ which contains the region 
in question. With y(z) in the formula (17) replaced by yz,;(z) the relation as- 
sumes the form (20) in which the limit of integration * still remains to be 
specified. This limit will be chosen as either *=£, or *=0 as dictated by 
the requirement that in proceeding along a I’ curve from the point * to the 
point £ the abscissa shall be algebraically non-decreasing in the case 7 =1, 
and non-increasing in the case 7=2. Inasmuch as the reasoning is entirely 
similar the discussion will be given only for the case 7=1. Then when the 
chosen region is one in which R(~)>0, * =0, while if #(£) <0 in 7, then 
*=£,,. 

Case 1. R() >0 in r. Since the integration extends from 0 to &, it is clear 
that the discussion of §6 applies without modification if | ¢ | <V. Hence 


is uniformly satisfied. 


(n) M 


When | £|>W the path of integration, which may be chosen as a I’ curve, 
contains a point £ for which | >| =. Then let the relation (24) in its second 
form be written 


The formulas (13a) and (19) show directly that when | £| is large the 
functions Y (2) are bounded. Since in the assumed configuration the quanti- 
ties &*e-§ and (& &")?*e-«-& are likewise bounded, it is clear from the first 
and second of the formulas (22) respectively that 


(29) 


Since 
(2) 


102 R. E. LANGER [January 


it follows, in virtue of hypothesis (v), that when =1 the integral on the 
right of the first of the relations (29) is bounded while that in the second rela- 
tion is of the order O(log | !). For this value of 1, therefore, the result 


M log| >| 
|| 

may be concluded, and in virtue of (27) the general validity of the relation 

(30) follows by induction. 


Case 2. R(~) <0 in r. In this case the integration extends from &, to the 
point £. If | the formula 


(30) | | < 


Vix) = , 
may be used, and since the right member is of precisely the structure of the 
final term of the relation (28) the conclusion (30) may be reached as in the 
preceding case. 
If | &|<WN the first of the formulas (24) may be written 
yrale) = Tyna”, + 

With the present. configuration the quantity &**eh is bounded, and the 
formulas (21) show respectively that 


| £0, | < yaa 


n— n— d. 
fal 


In the manner of the preceding discussion, it follows readily that 


M log| | )* 
@| < when |e] N. 


(31) | 


This relation obviously displaces (27) and is, therefore, valid for all regions r 
in which J. 

The results (30) and (31) assure the convergence of the series which occur 
in the formulas (23), when | \| is large. Recalling the relations (19) the results 
(together with those for the derived functions and for the case 7=2) may be 
enunciated as follows. 

Corresponding to any region of the type r there exists a pair of solutions 
of the given differential equation which in that region are subject to the de- 
scriptions 




E(z, log 


E(z, log 


(32) 
un. i(2) = + » when |é| < 


and 
E(z, d) log 


Fk+1/4 E(z, log Xx 


un, ;(2) = yn,i(2) + ett/2 » when | > N. 


Un, = Ya, i(Z) + 
(33) 


It may be noted in connection with these formulas that in any region r 
within which | #(£) | is unbounded the solution of the sub-dominant form is 
unique except for a constant factor. This follows from the fact that every 
solution u(z) must be expressible in the form 


u(z) = + 2(z), 


with coefficients which are free from z, and unless the coefficient of the domi- 
nant solution on the right is zero the solution is itself of the dominant form. 

It should also be remarked that each set of solutions $u_{h,j}(z)$ of the description (33) has been deduced for a specific region $r$, and, although the notation has not been designed to indicate the fact, this set of solutions for any one region $r_1$ is not, in general, identical with that for another region $r_2$. In a special but important case the existence of a set of solutions which retain the forms (32), (33) over two abutting regions $r$ may be deduced as follows. Let $r_1$ and $r_2$ be two regions of this type which lie within one and the same sector $\Xi^{(h)}$ and which abut along the imaginary $\xi$ axis, $r_1$ lying in the half-plane $R(\xi) < 0$, and $r_2$ in the half-plane $R(\xi) \ge 0$. Denote by $u_{h,1}(z)$ the solution which is sub-dominant in $r_1$ and by $u_{h,2}(z)$ the solution which is sub-dominant in $r_2$. These solutions are linearly independent, as may be seen by comparing the respective formulas (33) along the imaginary axis. Hence $u_{h,1}(z)$ and $u_{h,2}(z)$ are of the dominant form respectively in the regions $r_2$ and $r_1$, and are thus described by the formulas (32) and (33) over the two regions $r_1$ and $r_2$ simultaneously. Such a pair of regions may constitute a large portion or even the whole of the region $\Xi^{(h)}$.

9. The solutions for general values of $\xi$. Between any three of the various solutions $u(z)$ which have been defined there exists a linear relation with constant coefficients. Thus, in particular, for each index $h$


(34) un, = a. + ag, j= 1,2, 




with coefficients given by the familiar formulas 
i,j = 1, 2. 
W(m, u2) 


Since the Wronskians are all independent of z, they may be evaluated at the 
origin, and in virtue of the relations (8), (25), and (32) it is found that 


{ _ W(ys-i, E(a) log 
W(y1; 92) 
On the other hand, the relations (12) may be made to yield the equalities 
_ Wys-is Ya. (+ (h) 
W(y1, — 


whence it follows that 


the coefficients on the right being given by the formulas (11). 
With the values (35) the relations inverse to (34) are readily computed 
to be 


(36) 


a result which is obtainable independently of the value of h. Let any point z 
be given then, and let / be determined as the index of the region =“ in which 
the point z lies. The formula (36) may then be resorted to, and since for 
the value of z given the formulas (33) and (13a) are valid, the asymptotic 
expression for u,(z) is obtained. Quantitatively the results may be summa- 
rized as follows. 


The solutions of the given differential equation which are determined by the 
initial values (25) are given when | &|<N by the formulas 


u;(2) = +1/4(€) + 
¢'(2) 
2r$7(2) 


037) E(z, d) 


My 




When ||>WN and £ lies in a sector Z, then 


+ 4) (hy) k 


(38) 

uj (2) =o + 4) alee 
an which the coefficients are given by the formulas (11), and the symbol [C] desig- 
nates in each case an expression which differs from C by quantities of the order 
O(| and O(log 


Lastly, on substituting the forms (38) into the relation (34) (the values 
of / in the two expressions being not necessarily the same) the forms of the 
solutions u,,,(z) for general values of z may be derived. These results are the 
following: 


— [C;, 


For any index h= —2, —1, 0, or 1, there exists for the given differential 
equation a pair of solutions uy, ,(2z), which, when lg |>N and z lies in a regionr 
(or a pair of such regions which abut along R(t) =0) in the domain EZ, are 
of the form 


1 
= kett/2(1], 


(39) un i(Z) = EF /2[1 


When | &|>N but z is in the domain =, then the respective forms are 


h h, —§/2 
[A; (A ,m) } 


+ [Aja 


un, i(2) = ) 


with coefficients given by the formulas we and (11). 


’ 


It is sometimes possible to justify replacing a coefficient [0] by 0 itself 
on the ground that the forms given for two sectors =‘ which have a region 
in common must be asymptotically equivalent within that region. Thus, by 
way of illustration, if every sector 2‘ consists of but two regions 7, and 
there accordingly exist solutions having the forms (39) over the entire sec- 
tors =”, it is found that formulas (40) yield for — in the sector Z‘-” the 
value 


uo,2(2) = [0 + [1 ]é*e-t/2} 




Now unless the coefficient indicated as [0] is actually 0, the term in which it 
occurs is dominant in the region common to 2» and = and the formula 
conflicts with the appropriate formula (39). Hence in this case it must be 
concluded that the coefficient in question actually vanishes. 


Lastly, when |&|<N these solutions may be described by the formulas 


+1 1 1 
Un,i(2) = r( - >) + Cis -1/4(€) 


E(z, log 
= FA @ (2) r( (ati ante) Ms 


UNIVERSITY OF WISCONSIN,
MADISON, WIS.


THE INVERSION OF THE LAPLACE INTEGRAL 
AND THE RELATED MOMENT PROBLEM* 


BY 
D. V. WIDDER 


INTRODUCTION 


Let $\alpha(t)$ be a complex function of the real variable $t$, of bounded variation in the interval $(0, R)$ for every positive $R$, and such that the integral

\[ f(x) = \int_0^\infty e^{-xt}\,d\alpha(t) \tag{1} \]


converges for some complex value of $x$. It is then known that the integral will converge for all complex $x$ of greater real part and will consequently define a function, which we have denoted by $f(x)$, in a half-plane. By the inversion of the integral (1) we mean the determination of the function $\alpha(t)$ in terms of the function $f(x)$. There is one special case of (1) of particular interest, that in which $\alpha(t)$ is a step-function with jumps only at the integral points. Then

\[ f(x) = \sum_{n=0}^\infty a_n e^{-nx}. \tag{2} \]


In this case the problem of inversion reduces to the determination of the coefficients of the power series (2). If we set $z = e^{-x}$ in (2) we get

\[ F(z) = f(\log(1/z)) = \sum_{n=0}^\infty a_n z^n, \]

and there are two familiar determinations of the coefficients. One is in terms of a contour integral

\[ a_n = \frac{1}{2\pi i}\int_C \frac{F(z)}{z^{n+1}}\,dz, \]

where the contour $C$ may be taken as a circle with center at $z = 0$ and with any sufficiently small radius, the integration being in the positive sense. The other is

\[ a_n = F^{(n)}(0)/n!. \]

If we return to the function $f(x)$, the contour $C$ becomes a vertical line in


* Presented to the Society, December 29, 1932; received by the editors January 13, 1933, and 
in revised form in June, 1933. 




the $x$-plane. Likewise we should have

\[ a_n = \lim \frac{1}{n!}\,\frac{d^n}{dz^n} f\Bigl(\log\frac{1}{z}\Bigr) \]

as $x$ becomes infinite along the positive real axis.*

Reasoning from this special case, we should expect that for the general case (1) there would be two determinations of $\alpha(t)$, one in terms of a contour integral along a vertical line, the other in terms of the derivatives of $f(x)$ for large real positive values of $x$. The first of these is in fact very well known. For the special case in which

\[ \alpha(t) = \int_0^t \varphi(u)\,du \tag{3} \]

and

\[ f(x) = \int_0^\infty e^{-xt}\varphi(t)\,dt \tag{4} \]

it was already known to Cauchy in the form

\[ \varphi(t) = \frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty} e^{xt} f(x)\,dx, \]

where $c$ is a sufficiently large real constant and the path of integration is a vertical line. The general case (1) seems first to have been treated by H. Hamburger.† It is remarkable that the second method, discovered the first in the special case (2), has received no attention until very recent times. It was apparently discovered first by W. P. Mason in his researches in electrical theory of 1929. No rigorous derivation of the inversion formula was published, however. In 1930 E. L. Post did obtain an inversion formula of the type in question for the special case in which the function $\alpha(t)$ has the form (3) with $\varphi(u)$ a continuous function.‡ It is the purpose of the present paper to obtain an inversion formula for the general integral (1).

By way of introducing this inversion operator let us consider first func- 
tions f(x) which are analytic at infinity and which vanish there. That is, f(x) 


* We could of course take the approach along any parallel line or along any curve proceeding 
indefinitely to the right. 

† H. Hamburger, Über eine Riemannsche Formel aus der Theorie der Dirichletschen Reihen, Mathematische Zeitschrift, vol. 6 (1920), p. 6. See also D. V. Widder, A generalization of Dirichlet's series and of Laplace's integrals by means of a Stieltjes integral, these Transactions, vol. 31 (1929), p. 708. We shall refer to this latter paper as I.

‡ E. L. Post, Generalized differentiation, these Transactions, vol. 32 (1930), p. 772.




can be represented for $|x|$ sufficiently large by a convergent power series in $1/x$,

\[ f(x) = \sum_{n=0}^\infty \frac{b_n}{x^{n+1}}. \tag{5} \]


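If we note that

\[ \frac{1}{x^{n+1}} = \int_0^\infty e^{-xt}\,\frac{t^n}{n!}\,dt \]

for all $x$ whose real part is positive, we see that

\[ f(x) = \int_0^\infty e^{-xt}\varphi(t)\,dt, \]

where

\[ \varphi(t) = \sum_{n=0}^\infty \frac{b_n t^n}{n!}. \]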
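Now introduce the operator

\[ L_{k,t}[f(x)] = \frac{(-1)^k}{k!}\, f^{(k)}\Bigl(\frac{k}{t}\Bigr)\Bigl(\frac{k}{t}\Bigr)^{k+1}. \]

For the series (5),

\[ L_{k,t}[f(x)] = \sum_{n=0}^\infty b_n\,\frac{(n+k)!}{n!\,k!\,k^n}\,t^n. \]

As $k$ becomes infinite,

\[ \lim_{k\to\infty} \frac{(n+k)!}{k!\,k^n} = 1, \]

so that, if it were permissible to interchange summation and limit signs, we should have

\[ \lim_{k\to\infty} L_{k,t}[f(x)] = \sum_{n=0}^\infty \frac{b_n t^n}{n!} = \varphi(t). \]

This leads us to introduce an operator $L_t[f(x)]$ by the equation

\[ L_t[f(x)] = \lim_{k\to\infty} L_{k,t}[f(x)]. \]

We could easily justify the formal steps taken above and show rigorously that

\[ L_t[f(x)] = \varphi(t) \qquad (0 < t < \infty). \]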


However, our purpose for the moment is merely to provide a heuristic approach. Much more general cases, in which the method of series will not be applicable, will be treated later.
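As a hedged numerical illustration of the heuristic operator (it is not part of the original argument), take $f(x) = 1/(x+1)$, which is the integral (4) for $\varphi(t) = e^{-t}$. Reading the operator as $L_{k,t}[f(x)] = \frac{(-1)^k}{k!} f^{(k)}(k/t)\,(k/t)^{k+1}$ and using $f^{(k)}(x) = (-1)^k k!/(x+1)^{k+1}$, one finds $L_{k,t}[f(x)] = (1 + t/k)^{-(k+1)}$, which should tend to $e^{-t}$. The function name below is hypothetical and chosen only for this sketch.

```python
import math


def post_widder_exp(t: float, k: int) -> float:
    """L_{k,t}[f] for f(x) = 1/(x + 1), i.e. phi(t) = exp(-t).

    Uses the closed form (1 + t/k)^(-(k+1)), which follows from
    f^(k)(x) = (-1)^k k! / (x + 1)^(k+1) evaluated at x = k/t.
    """
    if t <= 0 or k < 1:
        raise ValueError("need t > 0 and k >= 1")
    # log1p keeps accuracy when t/k is very small.
    return math.exp(-(k + 1) * math.log1p(t / k))


if __name__ == "__main__":
    t = 1.0
    for k in (1, 10, 100, 10_000):
        # The second column approaches exp(-t) slowly, roughly like 1/k.
        print(k, post_widder_exp(t, k), math.exp(-t))
```

The error decays only like $1/k$, which is consistent with the operator being a limiting process rather than a rapidly convergent approximation.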

We may now see what sort of operator will apply to the Stieltjes integral (1) by performing an integration by parts,

\[ f(x) = x\int_0^\infty e^{-xt}\alpha(t)\,dt. \]

If we apply the operator $L_t$ to the function $f(x)/x$, we expect, guided by the above formal work, to obtain the function $\alpha(t)$. On the other hand we can show, as we do in §3, that

\[ L_{k,t}[f(x)/x] = f(\infty) + (-1)^{k+1}\int_{k/t}^\infty \frac{u^k}{k!}\, f^{(k+1)}(u)\,du. \]

We are thus led to introduce a second operator $S_t$ by the equations

\[ S_{k,t}[f(x)] = f(\infty) + (-1)^{k+1}\int_{k/t}^\infty \frac{u^k}{k!}\, f^{(k+1)}(u)\,du, \qquad S_t[f(x)] = \lim_{k\to\infty} S_{k,t}[f(x)]. \tag{6} \]

We shall show in §3 that our conjecture is verified, that for every convergent integral (1)

\[ S_t[f(x)] = \frac{\alpha(t+) + \alpha(t-)}{2}, \]

at least if a suitable constant is added to the function $\alpha(t)$ so as to make $\alpha(0) = 0$. (This change in $\alpha(t)$ of course produces no change in $f(x)$.)

In §4 we treat the most general integral (4) where the function $\varphi(t)$ is integrable in the sense of Lebesgue, and we find that the operator $L_t$ inverts the integral for almost all positive values of $t$. The remaining sections of Part I are devoted to further inversion formulas, it being always understood that the function $f(x)$ is known to have a representation (1). In Part II we drop this assumption and consider all functions $f(x)$ to which the operators $L$ and $S$ are applicable. Their very nature demands, of course, that the functions $f(x)$ must have derivatives of all orders and must have certain asymptotic properties as $x$ becomes positively infinite. We are thus able to develop necessary and sufficient conditions that a function $f(x)$ should have a representation (1). Among other results we prove a theorem of S. Bernstein to the effect that if $f(x)$ is completely monotonic,

\[ (-1)^k f^{(k)}(x) \ge 0 \qquad (k = 0, 1, \cdots), \tag{7} \]

it must have the representation (1) with $\alpha(t)$ a non-decreasing function. By use of our operator $S$ this result becomes almost self-evident. For, if (7) holds, $S_{k,t}[f(x)]$ is clearly a non-decreasing function of $t$. The same must be true of $S_t[f(x)]$ if this function exists. The existence of this limit of course requires proof.

In Part III we take up several applications. First we discuss the zeros of a function $f(x)$ represented in the form (1). To make our results more descriptive let us consider here the case (4) where $\varphi(t)$ is real and continuous. We are able to show that if $\varphi(t)$ has just $m$ changes of sign in $(0, \infty)$ then $f^{(k)}(x)$ will have exactly the same number in the interval of convergence of (4) for all $k$ sufficiently large. We are even able to compute the coordinates of the zeros of $\varphi(t)$ in terms of those of $f^{(k)}(x)$. Thus if the zeros of $\varphi(t)$ are at the points $t_1, t_2, \cdots, t_m$, where

\[ t_1 < t_2 < \cdots < t_m, \]

and if those of $f^{(k)}(x)$ are at the points $x_{1,k}, x_{2,k}, \cdots, x_{m,k}$, where

\[ x_{1,k} > x_{2,k} > \cdots > x_{m,k}, \]

then

\[ \lim_{k\to\infty} \frac{k}{x_{i,k}} = t_i \qquad (i = 1, 2, \cdots, m). \]

In an article in the Proceedings of the National Academy of Sciences* we announced this result with the restriction that $\varphi(t)$ should approach a limit as $t$ becomes infinite, observing that the condition was probably redundant. J. Karamata† in a subsequent note of the same Proceedings removed this restriction but imposed another. In the present paper we remove all conditions of the type, demanding only that the integral (1) should converge, a condition imposed by the nature of the problem. We do not even demand that $\alpha(t)$ or $\varphi(t)$ should be continuous.

In Part IV we treat the complex case. The integral is taken in the form (4), and the function $\varphi(t)$ is supposed analytic in the half-plane for which the real part of $t$ is positive and of such a nature that the integral (4) converges when the path of integration is the positive real axis. We are then able to show that

\[ L_t[f(x)] = \varphi(t) \]

for all complex $t$ whose real part is positive. By use of this result we are able to treat also the complex zeros of $\varphi(t)$ in terms of the complex zeros of $f^{(k)}(x)$.


* D. V. Widder, On the changes of sign of the derivatives of a function defined by a Laplace integral, 
Proceedings of the National Academy of Sciences, vol. 18 (1932), p. 112. 

† J. Karamata, Remarks on a theorem of D. V. Widder, Proceedings of the National Academy of Sciences, vol. 18 (1932), p. 406.




In Part V we take up a moment problem intimately related to the Laplace integral. If in (1) we allow $x$ to take on only positive integral values we have by an obvious change of variable

\[ \mu_n = \int_0^\infty e^{-nt}\,d\alpha(t) = \int_0^1 t^n\,d\beta(t) \qquad (n = 0, 1, 2, \cdots). \tag{8} \]


The determination of $\beta(t)$ in terms of the sequence $\{\mu_n\}$ is the moment problem of Hausdorff. Since the variable $n$ now runs through a discrete set of values we expect to get an operator applicable to the sequence $\{\mu_n\}$ as the operator $L$ was to the integral (4) by replacing the derivative of order $k$ by a difference of order $k$ and by replacing a $(k+1)$th power of $x$ by a product $n(n+1)\cdots(n+k)$. Proceeding in this way we are led to define the operator $L_t\{\mu_n\}$ by the equations

\[ L_{k,t}\{\mu_n\} = \frac{(n+k+1)!}{n!\,k!}\,(-1)^k \Delta^k \mu_n, \qquad n = \Bigl[\frac{kt}{1-t}\Bigr], \]

\[ L_t\{\mu_n\} = \lim_{k\to\infty} L_{k,t}\{\mu_n\}. \]

Here $[u]$ means the greatest integer contained in $u$. We find in fact that this operator does invert the moment sequence (8) if $\beta(t)$ has the form (3) with $\varphi(t)$ integrable in $(0, 1)$. That is,

\[ L_t\{\mu_n\} = \varphi(t) \]

almost everywhere in $(0, 1)$. In defining an operator $S$ which will be applicable to the general sequence (8) we again proceed by analogy replacing the integral sign by the summation sign and the derivatives by differences in (6). In this way we arrive at the operators

(i+)! 


t=n+1 ilk! 
= tim | 


We then prove that 


We then turn to sequences $\{\mu_n\}$ which are not known to be moment sequences and discuss the effect of the operators $L$ and $S$ on them. We are able to obtain necessary and sufficient conditions that a given sequence should be a moment sequence. In particular Hausdorff's theorem to the effect that every completely monotonic sequence,

\[ (-1)^k \Delta^k \mu_n \ge 0 \qquad (k = 0, 1, \cdots;\ n = 0, 1, \cdots), \]

has the form (8) with $\beta(t)$ non-decreasing is obtained by use of the moment operator $S_t\{\mu_n\}$ with the same ease as Bernstein's theorem was obtained by use of the integral operator $S_t[f(x)]$.
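The moment operator can be tried on a simple sequence. The sketch below assumes the reading $L_{k,t}\{\mu_n\} = \frac{(n+k+1)!}{n!\,k!}(-1)^k\Delta^k\mu_n$ with $n = [kt/(1-t)]$ and the forward difference $\Delta\mu_n = \mu_{n+1} - \mu_n$; both the function names and the test sequence $\mu_n = 1/(n+2)$ (the moments of $\varphi(t) = t$) are choices made for illustration only. For this sequence the operator evaluates exactly to $(n+1)/(n+k+2)$.

```python
from fractions import Fraction
from math import comb, factorial, floor


def moment_operator(mu, k: int, t: Fraction) -> Fraction:
    """L_{k,t}{mu_n} = (n+k+1)!/(n! k!) * (-1)^k Delta^k mu_n, n = [k t/(1-t)].

    `mu` is a callable n -> mu_n returning exact Fractions. Delta is the
    forward difference, so (-1)^k Delta^k mu_n = sum_j (-1)^j C(k,j) mu_{n+j}.
    """
    if not 0 < t < 1:
        raise ValueError("need 0 < t < 1")
    n = floor(k * t / (1 - t))
    diff = sum((-1) ** j * comb(k, j) * mu(n + j) for j in range(k + 1))
    weight = Fraction(factorial(n + k + 1), factorial(n) * factorial(k))
    return weight * diff


def mu_linear(n: int) -> Fraction:
    """Moments of phi(t) = t on (0, 1): mu_n = 1/(n + 2)."""
    return Fraction(1, n + 2)


if __name__ == "__main__":
    for k in (10, 100, 300):
        value = moment_operator(mu_linear, k, Fraction(1, 4))
        print(k, float(value))  # should drift toward phi(1/4) = 0.25
```

Exact rational arithmetic is used because the alternating binomial sum cancels catastrophically in floating point once $k$ is moderately large.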

As a further application of the operators $L$ and $S$ a study is made of the changes of sign of a sequence $\{\mu_n\}$ as affected by the changes of trend in $\beta(t)$ or by the changes of sign in $\varphi(t)$. Results analogous to these already mentioned for the Laplace integral are obtained, results which serve to generalize certain theorems of M. Fekete.† In conclusion the complex case is treated. A slight modification of the operator $L$ is necessary to give it meaning for complex $t$. In the foregoing definition we merely replace $[kt/(1-t)]$ by $kt/(1-t)$ thus defining an operator which we denote by $L_t^*\{\mu_n\}$. We find that if $\varphi(t)$ is analytic in a circle of unit diameter with center at $t = \tfrac12$, then throughout that circle

\[ L_t^*\{\mu_n\} = \varphi(t). \]


The method found to be most serviceable in the major part of this paper 
is the Laplace method of determining an asymptotic expression for an integral 
of the form 


f Lf) 


when k becomes infinite. In the last section of the paper we use a modifica- 
tion of this method due to Perron for the case in which the integrand is com- 
plex and the path of integration is in the complex plane. 


Part I 
INVERSION FORMULAS 


1. The problem. In Part I we shall discuss functions $f(x)$ of the real variable $x$ which are known to be expressible by means of a Laplace-Stieltjes integral

\[ f(x) = \int_0^\infty e^{-xt}\,d\alpha(t), \tag{1.1} \]

where the function $\alpha(t)$ is a real function of bounded variation in the interval


† M. Fekete, Sur les changements de signe d'une fonction continue dans un intervalle, Paris Comptes Rendus, vol. 190 (1930), p. 1366. References to Fekete's earlier work on the subject are given in this article.


$0 \le t \le R$ for every positive $R$, where $\alpha(0) = 0$, and where the integral converges for some value of $x$. We shall obtain a formula for the determination of $\alpha(t)$ in terms of the values of $f(x)$ and its derivatives. It will appear that a knowledge of these values in a neighborhood of $x = +\infty$ will be sufficient. In particular if $\alpha(t)$ is the integral of a function $\varphi(t)$ so that (1.1) becomes

\[ f(x) = \int_0^\infty e^{-xt}\varphi(t)\,dt, \tag{1.2} \]

we shall obtain a similar formula for $\varphi(t)$.
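2. A preliminary limit. In order to develop the inversion formula for (1.1) we find it useful to know the value of the following limit:

\[ \lim_{k\to\infty} H_k(t), \qquad H_k(t) = e^{-k/t}\Bigl[1 + \frac{k}{t} + \frac{1}{2!}\Bigl(\frac{k}{t}\Bigr)^2 + \cdots + \frac{1}{k!}\Bigl(\frac{k}{t}\Bigr)^k\Bigr], \]

for all positive $t$. To show the existence of this limit and to obtain its value we first express it as a definite integral by use of Taylor's formula with exact remainder,

\[ f(x) = f(0) + f'(0)x + \frac{f''(0)}{2!}x^2 + \cdots + \frac{f^{(k)}(0)}{k!}x^k + \int_0^x \frac{(x-u)^k}{k!}\, f^{(k+1)}(u)\,du. \]

For our purposes take $f(x) = e^x$ and replace $x$ by $k/t$. Then

\[ e^{k/t} = 1 + \frac{k}{t} + \frac{1}{2!}\Bigl(\frac{k}{t}\Bigr)^2 + \cdots + \frac{1}{k!}\Bigl(\frac{k}{t}\Bigr)^k + \frac{1}{k!}\int_0^{k/t}\Bigl(\frac{k}{t} - u\Bigr)^k e^u\,du, \]

\[ H_k(t) = 1 - \frac{1}{k!}\int_0^{k/t}\Bigl(\frac{k}{t} - u\Bigr)^k e^{u - k/t}\,du = 1 - \frac{1}{k!}\int_0^{k/t} u^k e^{-u}\,du. \]

If we set $u = ky$ this becomes

\[ H_k(t) = 1 - \frac{k^{k+1}}{k!}\int_0^{1/t} y^k e^{-ky}\,dy. \]

We can show at once that $H_k(t)$ approaches 1 as $k$ becomes infinite for $t > 1$. For, the function $ye^{-y}$ has a single maximum at $y = 1$, and is consequently increasing for $y < 1$, decreasing for $y > 1$. For $t > 1$ this maximum is outside the interval of integration. Hence

\[ |H_k(t) - 1| < \frac{k^{k+1}}{k!}\,\frac{e^{-k/t}}{t^{k+1}}. \]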


The right-hand side of this inequality approaches zero as $k$ becomes infinite, as one sees by use of Stirling's formula.

Next suppose that $t < 1$. Since

\[ k! = \int_0^\infty e^{-u} u^k\,du \]

we may clearly obtain the following expression for $H_k(t)$:

\[ H_k(t) = \frac{1}{k!}\int_{k/t}^\infty u^k e^{-u}\,du = \frac{k^{k+1}}{k!}\int_{1/t}^\infty y^k e^{-ky}\,dy. \]

If $t < 1$, the function $ye^{-y}$ is a decreasing function for $y > 1/t$, so that

\[ |H_k(t)| \le \Bigl(\frac{1}{t}\Bigr)^{k-1} e^{-(k-1)/t}\,\frac{k^{k+1}}{k!}\int_{1/t}^\infty y e^{-y}\,dy. \]

The right-hand side of this inequality approaches zero as $k$ becomes infinite, so that $H_k(t)$ must do so also.

It remains to treat the case $t = 1$. We have

\[ H_k(1) = 1 - \frac{1}{k!}\int_0^k u^k e^{-u}\,du, \]

and by a direct application of Laplace's method* we see that

\[ \lim_{k\to\infty} H_k(1) = \tfrac12. \]
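We have thus proved

THEOREM 1. On setting

\[ H_k(t) = e^{-k/t}\Bigl[1 + \frac{k}{t} + \frac{1}{2!}\Bigl(\frac{k}{t}\Bigr)^2 + \cdots + \frac{1}{k!}\Bigl(\frac{k}{t}\Bigr)^k\Bigr] \]

we have

\[ H_k(t) = 1 - \int_0^{k/t} \frac{u^k}{k!}\,e^{-u}\,du = \int_{k/t}^\infty \frac{u^k}{k!}\,e^{-u}\,du \]

and

\[ \lim_{k\to\infty} H_k(t) = \begin{cases} 0 & \text{if } 0 < t < 1, \\ \tfrac12 & \text{if } t = 1, \\ 1 & \text{if } 1 < t < \infty. \end{cases} \]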
* G. Pólya and G. Szegő, Aufgaben und Lehrsätze aus der Analysis, vol. I, chapter 2, p. 80, problem 210. The result was known to Jacobi; cf. Gesammelte Werke, vol. 7, p. 213.

If we set $g(t) = \lim_{k\to\infty} H_k(t)$ then

\[ e^{-x} = \int_0^\infty e^{-xt}\,dg(t), \]

so that Theorem 1 gives us an inversion formula for the integral (1.1) in this special case.
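The three limiting values in Theorem 1 can be observed numerically. The following is a minimal sketch, assuming the definition $H_k(t) = e^{-k/t}\sum_{j=0}^{k}(k/t)^j/j!$; the function name `H` is chosen here for illustration, and the terms are summed in logarithmic form only to avoid overflow for large $k$.

```python
import math


def H(k: int, t: float) -> float:
    """H_k(t) = exp(-k/t) * sum_{j=0}^{k} (k/t)^j / j!, summed in log space."""
    if k < 1 or t <= 0:
        raise ValueError("need k >= 1 and t > 0")
    x = k / t
    log_x = math.log(x)
    # Each term is exp(-x + j*log(x) - log(j!)); lgamma(j + 1) equals log(j!).
    terms = (math.exp(-x + j * log_x - math.lgamma(j + 1)) for j in range(k + 1))
    return math.fsum(terms)


if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0):
        # Expected trend as k grows: 0 for t < 1, 1/2 for t = 1, 1 for t > 1.
        print(t, [H(k, t) for k in (10, 100, 2000)])
```

At $t = 1$ the approach to $\tfrac12$ is slow (of order $k^{-1/2}$), whereas for $t \ne 1$ the limits 0 and 1 are reached exponentially fast, in line with the Stirling estimates of this section.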

3. The inversion of the Laplace-Stieltjes integral. From the result of 
Theorem 1 we can now conjecture the inversion operator for the general 
integral (1.1). We introduce the following 


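DEFINITION. An operator $S_t[f(x)]$ is defined by the equations

\[ S_{k,t}[f(x)] = f(\infty) + (-1)^{k+1}\int_{k/t}^\infty \frac{u^k}{k!}\, f^{(k+1)}(u)\,du \qquad (k = 0, 1, 2, \cdots), \tag{3.1} \]

\[ S_t[f(x)] = \lim_{k\to\infty} S_{k,t}[f(x)]. \tag{3.2} \]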

For this operator to be applicable to a given function f(x) it is necessary 
that each of the operations involved in the definition should be well defined 
for the function. Thus f(x) must have continuous derivatives of all orders, 
must approach a finite limit as x becomes infinite, the improper integrals 
(3.1) must converge, and the limit (3.2) must exist. The operator is clearly 
distributive. We shall show in this section that it is well defined if f(x) has 
the representation (1.1) and that it serves to invert that integral. The result 
to be established is the following: 


THEOREM 2. If the integral

\[ f(x) = \int_0^\infty e^{-xt}\,d\alpha(t) \qquad (\alpha(0) = 0) \tag{3.3} \]

converges for $x > c$, then

\[ S_t[f(x)] = \frac{\alpha(t+) + \alpha(t-)}{2} \qquad (t > 0). \]


Since the given integral converges for $x > c$ we know that there exists a constant $M$ such that

\[ |\alpha(t)| < Me^{gt} \qquad (0 \le t < \infty), \tag{3.4} \]

where $g$ is a positive constant greater than $c$.* Hence, on integrating by parts, we obtain


* D. V. Widder, I, p. 703, Lemma 2. 




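\[ f(x) = \lim_{R\to\infty}\Bigl[e^{-xR}\alpha(R) + x\int_0^R e^{-xt}\alpha(t)\,dt\Bigr] \qquad (x > g), \]

\[ f(x) = x\int_0^\infty e^{-xt}\alpha(t)\,dt. \tag{3.5} \]

Set $f(x)/x = F(x)$ and introduce a new operator $L$ by the

DEFINITION. An operator $L_t[f(x)]$ is defined by the equations

\[ L_{k,t}[f(x)] = \frac{(-1)^k}{k!}\, f^{(k)}\Bigl(\frac{k}{t}\Bigr)\Bigl(\frac{k}{t}\Bigr)^{k+1} \qquad (k = 0, 1, 2, \cdots), \]

\[ L_t[f(x)] = \lim_{k\to\infty} L_{k,t}[f(x)]. \]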

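We shall show that this operation is well defined when applied to $F(x)$ and that the result of the operation is $[\alpha(t+) + \alpha(t-)]/2$. If $x > g$ equation (3.5) gives us*

\[ F^{(k)}(x) = (-1)^k\int_0^\infty e^{-xt} t^k \alpha(t)\,dt. \]

Hence

\[ L_{k,t}[F(x)] = \frac{1}{k!}\Bigl(\frac{k}{t}\Bigr)^{k+1}\int_0^\infty e^{-ky/t}\, y^k \alpha(y)\,dy. \]

Let $t_0$ be an arbitrary positive value of $t$ and make the transformation $u = y/t_0$. Then

\[ L_{k,t_0}[F(x)] = \frac{k^{k+1}}{k!}\int_0^\infty e^{-ku} u^k \alpha(t_0 u)\,du. \]

By use of the function $g(t)$ of §2 we now define the function

\[ \psi(u) = [\alpha(t_0+) - \alpha(t_0-)]\,g(u) + \alpha(t_0-). \]

This function has the properties

\[ \psi(1+) = \alpha(t_0+), \qquad \psi(1-) = \alpha(t_0-), \qquad \psi(1) = \frac{\alpha(t_0+) + \alpha(t_0-)}{2}, \]

so that the function

\[ \varphi(u) = \alpha(t_0 u) - \psi(u) \]

has the properties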


* D. V. Widder, I, p. 702, Corollary 2. 


\[ \varphi(1+) = \varphi(1-) = 0. \tag{3.6} \]
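We wish to show that the difference

\[ L_{k,t_0}[F(x)] - \frac{\alpha(t_0+) + \alpha(t_0-)}{2} \]

approaches zero with $1/k$. But

\[ \frac{k^{k+1}}{k!}\int_0^\infty e^{-ku} u^k \psi(u)\,du = [\alpha(t_0+) - \alpha(t_0-)]\,\frac{k^{k+1}}{k!}\int_0^\infty e^{-ku} u^k g(u)\,du + \alpha(t_0-)\,\frac{k^{k+1}}{k!}\int_0^\infty e^{-ku} u^k\,du \]

\[ = [\alpha(t_0+) - \alpha(t_0-)]\,\frac{k^{k+1}}{k!}\int_1^\infty e^{-ku} u^k\,du + \alpha(t_0-) \]

\[ = [\alpha(t_0+) - \alpha(t_0-)]\,H_k(1) + \alpha(t_0-) \to \frac{\alpha(t_0+) + \alpha(t_0-)}{2}. \]

Hence

\[ \lim_{k\to\infty}\Bigl\{L_{k,t_0}[F(x)] - \frac{\alpha(t_0+) + \alpha(t_0-)}{2}\Bigr\} = \lim_{k\to\infty}\frac{k^{k+1}}{k!}\int_0^\infty e^{-ku} u^k[\alpha(t_0 u) - \psi(u)]\,du. \tag{3.7} \]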


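By (3.6) we see that to an arbitrary positive quantity $\epsilon$ there corresponds a number $\eta$ such that

\[ |\varphi(u)| < \epsilon/3 \qquad (0 < |1 - u| \le \eta < 1). \tag{3.8} \]

Now divide the interval of integration in the right-hand member of (3.7) into the three parts $(0, 1-\eta)$, $(1-\eta, 1+\eta)$, $(1+\eta, \infty)$, the corresponding contributions being defined as $I_1$, $I_2$, and $I_3$ respectively. Then

\[ |I_2| < \frac{\epsilon}{3}\,\frac{k^{k+1}}{k!}\int_{1-\eta}^{1+\eta} e^{-ku} u^k\,du < \frac{\epsilon}{3}\,\frac{k^{k+1}}{k!}\int_0^\infty e^{-ku} u^k\,du = \frac{\epsilon}{3}. \]

If we denote by $K$ an upper bound of $|\varphi(u)|$ in the interval $(0, 1)$, we have as in §2

\[ |I_1| < \frac{k^{k+1}}{k!}\,K e^{-k(1-\eta)}(1-\eta)^{k+1}. \]

Since the right-hand side of this inequality approaches zero with $1/k$ we can determine $k_0$ so large that

\[ |I_1| < \epsilon/3 \qquad (k > k_0). \]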
Finally, since 


| a(tou)| +] 
S Meo + 3Me%* < Nev 


| p(w) | 
| | 


| 
[January 
(y > 0), 




we have 


peti 
(3.9) |I3| < + f 


Here ) is any fixed number greater than 7, so that the integral (3.9) converges. 
The right-hand side of (3.9) tends to zero with 1/k. Hence we can find kh; 
greater than ko such that 


| Is| < €/3 (k > hi). 
Then for k>&, 


a(to +) + a(to —) 


L,t,[F(x)] — 


a(to +) + a(tc —) 


Li, [F(x)] 2 


It will now be established that 
Lit [F(x)] f(2)], 


and this will complete the proof of the theorem. It is not evident that the 
operator S has a meaning as applied to f(x). To show that it has we prove 
first that 


(3.10) (— 1)*f™(2) = f 
0 


Indeed, if we set 


(3.11) B(0) = 0, = f (— 1)"u"de(u) (> 0), 
0 


where 
w(0) = 0, w(u) = a(u) — a(0 +) (u > 0), 


we have 


= f 


= 2% e~*'B(t)dt, 


B(t) = o(¢”) 


119 
and 
and 
(t— 0). 
| 


This follows since w(u) is continuous at u=0. The total variation of w(u) in 
the interval (0, 4), which we denote by V(t), approaches zero with ¢ and 

| a(t) | < eV(t). 


If $\epsilon$ is an arbitrary positive number we can determine a number $\delta$ such that


|a@|<— — (0<#<9). 
2 n! 
Hence 
(3.12) e~**B(t)dt | << — < — e-*4"di = . 
0 2 2 niJo 22* 
By integrating (3.11) by parts one sees at once that 8(#) satisfies an inequality 
| B(t) | < M’es’ (0<t<»), 


where M’ and g’ are suitable positive constants. Consequently 
(3.13) e~*"B(t)dt| < M's dt = — g’) 
= 0(x-") 


Combining inequalities (3.12) and (3.13) we have 
(3.14) f{™(x) = o(a-*) (x ©,n>0). 
For n=0 

fs) «(0 +) = f 


In this case 
B(t) = w(t) = o(1) 
and the above proof shows that 
f(x) — a(0 +) = o(1) 
whence 
= a(0 +). 
To show that the improper integral 


k 
(3.15) f 
k/t 


converges, we proceed by induction. The integral 


(¢-— 0), 
(x @), 
(; 
><) 
t 




= lim f(R) — 


k/t 


clearly converges to the value a(0+) —f(k/t). Suppose that we have shown 
that the integral (3.15) converges for k=m—1. The equation 


fir +0 (4) = (-) fr 


together with the relation (3.14) shows that (3.15) also converges for k =m, 
and thus we see that S;,:[f(x) ] exists for every positive t. 
Successive integration by parts shows that 


=1(2) 


+ "(=) (-)- (— yet (k) (= 
t t t 


On the other hand we have 


n=0 


k 


Hence 
L,[F(x)] = S.[f(«)]. 


This completes the proof of the theorem. In the course of the proof we have 
established a result of interest in itself and which we state as 


THEOREM 3. If the function $\varphi(t)$ is of bounded variation in the interval $(0, R)$ for every finite $R$, if for some constant $c$

\[ \varphi(t) = O(e^{ct}) \qquad (t \to \infty), \tag{3.16} \]

and if

\[ F(x) = \int_0^\infty e^{-xt}\varphi(t)\,dt, \]

then

\[ L_t[F(x)] = \frac{\varphi(t+) + \varphi(t-)}{2}. \]

Later we shall prove a much more general result of this same character.


It is of interest to illustrate the foregoing theory by the following ex- 
amples: 
F(x) =Tiy+1)a™, =? (y 2 1); 
f(x) a) = O<7<1)); 
= a(t) = (1 — e*)/c (c > 0); 
where the limits involved in the definition of the operators ZL and S may be 
directly computed. Theorem 3 is not applicable to the function '(y+1)a-7— 
for y <1 since the function ¢(#) =? is not of bounded variation in any interval 
including the origin. 
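The last of these examples can be checked by direct computation. The sketch below assumes the pairing $f(x) = 1/(x+c)$, $\alpha(t) = (1 - e^{-ct})/c$, which follows from $\int_0^\infty e^{-xt} e^{-ct}\,dt = 1/(x+c)$, and it applies the operator $L_{k,t}$ to $F(x) = f(x)/x = \frac{1}{c}\bigl(\frac{1}{x} - \frac{1}{x+c}\bigr)$. Since $L_{k,t}[1/x] = 1$ and $L_{k,t}[1/(x+c)] = (1 + ct/k)^{-(k+1)}$, the result is available in closed form. The function names are hypothetical.

```python
import math


def post_widder_stieltjes(t: float, k: int, c: float) -> float:
    """L_{k,t}[F] for F(x) = f(x)/x with f(x) = 1/(x + c), c > 0.

    By linearity, L_{k,t}[F] = (1 - (1 + c*t/k)^(-(k+1))) / c.
    """
    if t <= 0 or k < 1 or c <= 0:
        raise ValueError("need t > 0, k >= 1, c > 0")
    # expm1/log1p avoid cancellation when c*t/k or c*t is small.
    return -math.expm1(-(k + 1) * math.log1p(c * t / k)) / c


def alpha(t: float, c: float) -> float:
    """The determining function alpha(t) = (1 - exp(-c t)) / c."""
    return -math.expm1(-c * t) / c


if __name__ == "__main__":
    t, c = 0.7, 2.0
    for k in (1, 10, 1000):
        print(k, post_widder_stieltjes(t, k, c), alpha(t, c))
```

Because $\alpha(t)$ is continuous here, the limit is $\alpha(t)$ itself rather than a mean of one-sided values.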
4. The inversion of the Laplace-Lebesgue integral. In this section we develop an inversion formula for the case in which $\alpha(t)$ in (1.1) is the integral of a function $\varphi(t)$,

\[ \alpha(t) = \int_0^t \varphi(u)\,du, \]

where $\varphi(u)$ is merely integrable in the sense of Lebesgue. Then (1.1) takes the form (1.2). We shall be able to show that if (1.2) converges for some value of $x$ (and hence for every greater value), then

\[ L_t[f(x)] = \varphi(t) \]

for almost all positive values of $t$. It is important to note that no restriction of the type (3.16) is imposed on $\varphi(t)$, so that the result is the best possible one for integrals of the type (1.2). We state our result in

THEOREM 4. If the function $\varphi(t)$ is integrable in the interval $(0, R)$ for every positive $R$ and if the integral

\[ f(x) = \int_0^\infty e^{-xt}\varphi(t)\,dt \tag{4.1} \]

converges for $x > c$, then

\[ L_t[f(x)] = \varphi(t) \]

for almost all positive values of $t$.
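A case covered by Theorem 4 in which $\varphi(t)$ is integrable but unbounded near the origin can be worked in closed form. The sketch below takes $\varphi(t) = t^{-1/2}$, for which $f(x) = \sqrt{\pi/x}$ and $f^{(k)}(x) = (-1)^k\,\Gamma(k+\tfrac12)\,x^{-k-1/2}$, so that $L_{k,t}[f(x)] = \frac{\Gamma(k+1/2)}{\Gamma(k+1)}\sqrt{k/t}$. The choice of example and the function name are assumptions made for illustration; they do not appear in the paper.

```python
import math


def post_widder_inv_sqrt(t: float, k: int) -> float:
    """L_{k,t}[f] for f(x) = sqrt(pi/x), the transform of phi(t) = t^(-1/2).

    Closed form: Gamma(k + 1/2) / Gamma(k + 1) * sqrt(k / t),
    evaluated through lgamma to stay finite for large k.
    """
    if t <= 0 or k < 1:
        raise ValueError("need t > 0 and k >= 1")
    log_ratio = math.lgamma(k + 0.5) - math.lgamma(k + 1)
    return math.exp(log_ratio + 0.5 * math.log(k / t))


if __name__ == "__main__":
    t = 4.0
    for k in (1, 10, 1000):
        print(k, post_widder_inv_sqrt(t, k), t ** -0.5)
```

Since $\Gamma(k+\tfrac12)/\Gamma(k+1)$ behaves like $k^{-1/2}$ for large $k$, the values tend to $t^{-1/2}$ for every $t > 0$.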


Since ¢(#) is integrable we have* 


(4.2) | ou) du = of 


for almost all positive values of to. 


* See, for example, L. Tonelli, Serie Trigonometriche, p. 174. 




Let us fix attention on such a value $t_0$. Set


at») = f [o(u) — 


We wish to show that 
L.,[f(x)] = $(to). 


x 


In view of 


we have only to show that the integral 
(to) yk 
Le] = f en kulty (-) —[o(u) — 
x 0 to k! 


approaches zero as k becomes infinite. By introducing the function A(t, to) 
this becomes 


0 k k+1 


0 
Set 


f 
Since the integral (4.1) converges for x>c, there exist positive constants M 
and ¥ such that 
(4.4) |-y(t)| < Mev (0<t<~), 
On account of the relation 
B(t, to) = v(t) — (to) — o(to)(¢ — to) 


it is clear that A(t, to.) also satisfies an inequality of the type (4.4). Hence, on 
integrating (4.3) by parts, we see that the integrated term vanishes if & is 
sufficiently large (k >g, say). Thus 


(to) 


x 


k\*t1 1 ku*® 
I(k) = (-) —{ B(u, du. 
to kiJo to 


Le] ] = I(k), 


123 
| 
where 
j 
i 




In the integral $I(k)$ make the change of variable $u = t_0 y$. We thus obtain


Rett 


f (toy, to) — 1)dy. 
0 


Corresponding to an arbitrary positive $\epsilon$ there is a number $\eta$ so small that
(4.5) | B(yt0, to) | < y— 1] /3 (ly-1| <2) 


by virtue of (4.2). Divide the interval of integration into the three parts 
(0, 1—n), (1—n, ©), denoting the corresponding contributions 
by 1,(k), I2(k), Is(k) respectively. Then 


k 1 
f toy, to) | (1 — y)dy. 
The right-hand side of this inequality, and hence also the left, approaches 
zero with 1/k. 

Next consider the integral $I_3(k)$. It converges absolutely for numbers $k$ greater than $g$. Let $k_1$ be such a number. Then the integral


1 
(4.6) |n(&)| s— 


A= f e“hy| B(toy, to) | — 1)dy 
1 
converges and 
1 Rett 
| to (k— 1)! 


The constant $A$ is independent of $k$ and one sees easily that the right-hand side of (4.7) approaches zero with $1/k$. Hence we may determine $k_0$ so large that for $k > k_0$

\[ |I_1(k)| < \epsilon/3, \qquad |I_3(k)| < \epsilon/3. \]

As for $I_2(k)$, we have by virtue of (4.5)
€ 1 Rett 
I2(k —ku(y — 1)2yk-1g 
| 72(R) | (y — 1)*y*Idy 
—ku(y — 1)*y*-Idy, 
By use of the gamma function we see that the right-hand side of the latter 
inequality reduces to €/3 so that 


| 1(k)| <e (k > ko). 
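The gamma-function reduction used for $I_2(k)$ can be checked exactly. The following is a minimal sketch, not part of the original argument: it assumes only the elementary formula $\int_0^\infty e^{-ky}y^m\,dy = m!/k^{m+1}$ for integer m, and the helper name `normalised_second_moment` is hypothetical.

```python
from fractions import Fraction
from math import factorial


def normalised_second_moment(k: int) -> Fraction:
    """Return k^{k+1}/(k-1)! * integral_0^inf e^{-ky} (y-1)^2 y^{k-1} dy exactly.

    Assumes integral_0^inf e^{-ky} y^m dy = m!/k^{m+1} for integer m >= 0.
    """
    def moment(m: int) -> Fraction:
        return Fraction(factorial(m), k ** (m + 1))

    # Expand (y-1)^2 y^{k-1} = y^{k+1} - 2 y^k + y^{k-1} and integrate termwise.
    integral = moment(k + 1) - 2 * moment(k) + moment(k - 1)
    return Fraction(k ** (k + 1), factorial(k - 1)) * integral


# Each entry should be the rational number one.
print([normalised_second_moment(k) for k in (1, 2, 5, 20)])
```

Exact rational arithmetic is used so that the identity is confirmed without rounding error; this is why the bound for $I_2(k)$ is $\epsilon/3$ for every k and not merely in the limit.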
Hence

$$L_{t_0}[f(x)] = \phi(t_0)$$
and the proof is complete. We add the following


COROLLARY. If the function $\phi(t)$ is integrable in the interval (0, R) for every
positive R, is continuous at $t = t_0$, and if the integral

$$f(x) = \int_0^\infty e^{-xt}\phi(t)\,dt$$

converges for x > c, then

$$L_{t_0}[f(x)] = \phi(t_0) \qquad (t_0 > 0).$$
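The corollary can be illustrated numerically. The sketch below is an illustration only: it assumes the operator has the form $L_{k,t}[f] = \frac{(-1)^k}{k!} f^{(k)}(k/t)\,(k/t)^{k+1}$, which is consistent with the kernel integral used in the proof above, and it takes the example $\phi(t) = \sin t$, $f(x) = 1/(1+x^2)$, chosen here and not taken from the paper. The function name `post_widder_sin` is hypothetical.

```python
import math


def post_widder_sin(k: int, t: float) -> float:
    """L_{k,t}[f] for f(x) = 1/(1 + x^2), the Laplace transform of sin t.

    Since 1/(1 + x^2) = Im 1/(x - i), we have
    f^{(k)}(x) = (-1)^k k! Im (x - i)^{-(k+1)}, and therefore
    L_{k,t}[f] = Im (1 - i t/k)^{-(k+1)}  (assumed form of the operator).
    """
    z = complex(1.0, -t / k)
    return (z ** (-(k + 1))).imag


# The values should approach sin(1) as k grows, with error of order 1/k.
for k in (10, 100, 1000, 10000):
    print(k, post_widder_sin(k, 1.0), math.sin(1.0))
```

The closed form avoids numerical differentiation of high order, which would be unstable for large k.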


By use of this result we are now able to generalize Theorem 3, removing
the condition (3.16). Without loss of generality we may suppose that

(4.8) $\phi(t) = \dfrac{\phi(t+) + \phi(t-)}{2}.$


From the function g(t) of §2 we construct the function
$$\omega(u) = [\phi(t_0+) - \phi(t_0-)]\,g(u/t_0) + \phi(t_0-).$$
This function is seen to satisfy the conditions
$$\omega(t_0+) = \phi(t_0+),$$
$$\omega(t_0-) = \phi(t_0-),$$
$$\omega(t_0) = \phi(t_0).$$

Consequently the function $\phi(u) - \omega(u)$ has the value zero at $u = t_0$ and is
continuous there. Set

$$G(x) = \int_0^\infty e^{-xu}\omega(u)\,du.$$


The corollary of Theorem 4 gives us
$$L_{t_0}[F(x) - G(x)] = \phi(t_0) - \omega(t_0) = 0,$$
$$L_{t_0}[F(x)] = L_{t_0}[G(x)].$$

Since $\omega(u)$ is bounded, and thus satisfies condition (3.16), it follows that we
may apply Theorem 3 to G(x). Thus

$$L_{t_0}[G(x)] = \frac{\omega(t_0+) + \omega(t_0-)}{2} = \phi(t_0)$$
and
$$L_{t_0}[F(x)] = \phi(t_0).$$




We have thus proved
THEOREM 5. If the function $\phi(t)$ is of bounded variation in the interval
(0, R) for every positive R, and if

$$F(x) = \int_0^\infty e^{-xt}\phi(t)\,dt,$$

the integral converging for some value of x, then

$$L_t[F(x)] = \frac{\phi(t+) + \phi(t-)}{2}.$$
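The behaviour at a jump can be seen in a simple example. The sketch below is illustrative and rests on assumptions not stated in the paper: it takes $\phi$ to be the unit step at $t_0$ (so that $F(x) = e^{-t_0 x}/x$), assumes the kernel form $L_{k,t}[F] = (k/t)^{k+1}\frac{1}{k!}\int_0^\infty e^{-ku/t}u^k\phi(u)\,du$, and uses the standard identity between the upper incomplete gamma function and a Poisson tail sum. The name `post_widder_step` is hypothetical.

```python
import math


def post_widder_step(k: int, t: float, t0: float = 1.0) -> float:
    """L_{k,t}[F] for F(x) = exp(-t0*x)/x, the transform of the unit step at t0.

    With a = k*t0/t the kernel integral over (t0, infinity) equals
    exp(-a) * sum_{j=0}^{k} a^j / j!   (assumed form of the operator).
    """
    a = k * t0 / t
    log_a = math.log(a)
    # Work with logarithms of the terms to avoid overflow for large k.
    terms = (math.exp(j * log_a - a - math.lgamma(j + 1)) for j in range(k + 1))
    return math.fsum(terms)


# At the jump the values should drift toward 1/2; away from it toward 0 or 1.
for k in (10, 100, 2000):
    print(k, post_widder_step(k, 0.5), post_widder_step(k, 1.0), post_widder_step(k, 2.0))
```

The convergence at the jump is slow, roughly of order $k^{-1/2}$, in contrast to the order 1/k seen at points of smoothness.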


5. Uniform convergence. We have seen that if $\phi(t)$ is continuous at
$t = t_0$ then $L_{t_0}[f(x)] = \phi(t_0)$. If $\phi(t)$ is continuous in an interval $a \le t \le b$, then
the equation holds for each t of the interval. We now show further that as k
becomes infinite the sequence of functions $L_{k,t}[f(x)]$ tends to $\phi(t)$ uniformly
in any closed sub-interval of (a, b) not including the end points. The precise
result is stated in

THEOREM 6. If the function $\phi(t)$ is integrable in the interval (0, R) for every
positive R, is continuous in the interval $0 \le a \le t \le b$, and if the integral

(5.1) $f(x) = \displaystyle\int_0^\infty e^{-xt}\phi(t)\,dt$

converges for some value of x, then

$$\lim_{k\to\infty} L_{k,t}[f(x)] = \phi(t)$$

uniformly in the interval $a' \le t \le b'$, where $a < a' < b' < b$.


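Defining the function $\beta(t, t_0)$ as in the proof of Theorem 4, we obtain

(5.2) $L_{k,t}[f(x)] - \phi(t) = \left(\dfrac{k}{t}\right)^{k+1}\dfrac{1}{k!}\displaystyle\int_0^\infty e^{-ku/t}u^k\,d_u\beta(u, t).$

By virtue of (4.4) we have
$$|\beta(u, t)| < M e^{\gamma u} + M e^{\gamma t} + |\phi(t)(u - t)|.$$
Denote by N the maximum of $|\phi(t)|$ in the interval $a \le t \le b$. Then
$$|\beta(u, t)| \le M e^{\gamma u} + M e^{\gamma b} + N(u + b) \le M' e^{\gamma u} \qquad (0 \le u < \infty)$$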


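if M' is suitably chosen. Hence, on integrating (5.2) by parts the integrated
term vanishes for all t in the interval (a, b) provided $k > b\gamma$. Then

$$L_{k,t}[f(x)] - \phi(t) = I(k) = \left(\frac{k}{t}\right)^{k+1}\frac{1}{k!}\int_0^\infty e^{-ku/t}\left[\frac{k u^k}{t} - k u^{k-1}\right]\beta(u, t)\,du.$$

Make the change of variable $u = vt$:

$$I(k) = \frac{k^{k+1}}{(k-1)!\,t}\int_0^\infty e^{-kv}v^{k-1}\beta(tv, t)(v - 1)\,dv.$$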


Since $\phi(t)$ is continuous in the closed interval (a, b), to an arbitrary positive
$\epsilon$ there corresponds a number $\delta$ such that for any two points t' and t'' of
(a, b)
(5.3) $|\phi(t') - \phi(t'')| < \epsilon/3$
provided only that $|t' - t''| < \delta$. Now choose a number $\eta$ satisfying the
inequalities
(5.4) $\eta \le \delta/b, \quad \eta \le (b - b')/b', \quad \eta \le (a' - a)/a'.$
With this number $\eta$ define the integrals $I_1(k)$, $I_2(k)$, $I_3(k)$ as in §4. From
(4.6) we now obtain

1 Rett 


1 
(1 — 2) (1 — y)dy 


| 7.(k)| 


The right-hand side of this inequality is independent of t and tends to zero
with 1/k.

We consider next the integral $I_3(k)$. Determine $k_1 > \gamma b$. Then from (4.7)
we obtain


— 1)! 


B= f — 1)dy. 
1 


Again the right-hand side is independent of t and approaches zero with 1/k.
In order to discuss $I_2(k)$ we note first that

$$|\beta(tv, t)| \le \epsilon\,t\,|v - 1|/3$$
if $1 - \eta \le v \le 1 + \eta$ and if $a' \le t \le b'$. For, these inequalities imply that
$$|tv - t| = t\,|v - 1| \le b\eta \le \delta.$$

Moreover tv and t lie in the closed interval (a, b) since the inequalities
$$t \le b', \qquad v \le 1 + \eta$$
imply
$$tv \le b'(1 + \eta)$$
or, by virtue of (5.4),
$$tv \le b.$$

In a similar way the inequalities $t \ge a'$ and $v \ge 1 - \eta$ imply $tv \ge a$. It follows that


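$$|\beta(tv, t)| \le \left|\int_t^{tv}|\phi(u) - \phi(t)|\,du\right| \le \frac{\epsilon}{3}\,t\,|v - 1|.$$

Hence

$$|I_2(k)| \le \frac{k^{k+1}}{(k-1)!}\,\frac{\epsilon}{3}\int_{1-\eta}^{1+\eta} e^{-kv}v^{k-1}(v - 1)^2\,dv < \frac{k^{k+1}}{(k-1)!}\,\frac{\epsilon}{3}\int_0^\infty e^{-kv}v^{k-1}(v - 1)^2\,dv = \frac{\epsilon}{3}.$$

Consequently we may determine a number $k_2$ independent of t such that for
$k > k_2$
$$|I(k)| < \epsilon,$$

and the proof is complete.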
From this result follows immediately

THEOREM 7. If the function $\alpha(t)$ is of bounded variation in the interval
(0, R) for every positive R, is continuous in the interval $0 \le a \le t \le b$, if $\alpha(0) = 0$,
and if the integral

$$f(x) = \int_0^\infty e^{-xt}\,d\alpha(t)$$
converges for some x, then

$$\lim_{k\to\infty} S_{k,t}[f(x)] = \alpha(t)$$

uniformly in the interval $a' \le t \le b'$, where $a < a' < b' < b$.

For if x is sufficiently large
$$f(x)/x = \int_0^\infty e^{-xt}\alpha(t)\,dt,$$

and we have already seen that

$$L_{k,t}[f(x)/x] = S_{k,t}[f(x)].$$

By Theorem 6 the left-hand member of this equation, and hence also the
right, approaches $\alpha(t)$ uniformly in the interval (a', b').

The interval of uniform convergence may, under certain conditions, be
extended to infinity. For example, we have

THEOREM 8. If the function $\phi(t)$ is continuous in the interval $0 \le t < \infty$, and
if $\phi(t)$ tends to a finite limit as t becomes infinite, then

$$\lim_{k\to\infty} L_{k,t}[f(x)] = \phi(t)$$

uniformly in that interval provided $L_{k,0}[f(x)]$ is defined as $\phi(0)$.

Under the present hypotheses the integral (5.1) converges absolutely for
x > 0 and the method of proof is greatly simplified on that account. It will
be seen that we must show that

$$\frac{k^{k+1}}{k!}\int_0^\infty e^{-ky}y^k[\phi(ty) - \phi(t)]\,dy$$

approaches zero uniformly in the interval $0 \le t < \infty$. The details of the proof
are left to the reader.
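A numerical illustration of uniform convergence on the whole half-line may be helpful. The sketch below is only an illustration: it takes $\phi(t) = e^{-t}$, $f(x) = 1/(1+x)$, for which the assumed operator form $L_{k,t}[f] = \frac{(-1)^k}{k!}f^{(k)}(k/t)(k/t)^{k+1}$ gives $L_{k,t}[f] = (1 + t/k)^{-(k+1)}$, and it estimates the supremum of the error on a finite grid. The grid parameters and the name `sup_error` are assumptions made here.

```python
import math


def sup_error(k: int, t_max: float = 50.0, n: int = 5000) -> float:
    """Grid estimate of sup over 0 <= t <= t_max of |(1 + t/k)^{-(k+1)} - e^{-t}|."""
    worst = 0.0
    for i in range(n + 1):
        t = t_max * i / n
        approx = (1.0 + t / k) ** (-(k + 1))
        worst = max(worst, abs(approx - math.exp(-t)))
    return worst


# The estimated supremum should shrink roughly like a constant times 1/k.
for k in (10, 100, 1000):
    print(k, sup_error(k))
```

A grid estimate cannot prove uniformity, but the decay of the maximum with k is what the theorem predicts for this example.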

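This theorem is a result which the author stated without proof in an
earlier note.* Another result which we enunciated in that note we record here
as a

COROLLARY. The function
$$(1 + x)^{-k} - e^{-kx}$$
tends uniformly to zero in the interval $0 \le x < \infty$ as k becomes infinite.

To prove this take the function f(x) of Theorem 8 as $(1 + x)^{-1}$ and $\phi(t)$ as $e^{-t}$.
Then

$$\frac{1}{1 + x} = \int_0^\infty e^{-xt}e^{-t}\,dt \qquad (x > -1).$$

Since $e^{-t}$ approaches zero as t becomes infinite and is continuous in the
interval $0 \le t < \infty$, Theorem 8 is applicable, so that

$$\lim_{k\to\infty} L_{k,t}\left[\frac{1}{1 + x}\right] = \lim_{k\to\infty}\left(1 + \frac{t}{k}\right)^{-k-1} = e^{-t}$$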


* This was the Proceedings article mentioned in the Introduction. See Theorems 1 and 3 of that 
note. It will be found that the statement of Theorem 3 is somewhat different from the statement of 
Theorem 8 above, but the equivalence of the two results may be seen by making the change of 
variable k/x=t. 


uniformly for $0 \le t < \infty$. If we set t = kx we have
$$\lim_{k\to\infty}\,[(1 + x)^{-k-1} - e^{-kx}] = 0$$

uniformly for $0 \le x < \infty$. But
$$(1 + x)^{-k} - e^{-kx} = [(1 + x)^{-k-1} - e^{-kx}] + x(1 + x)^{-k-1}.$$

The maximum of the function $x(1 + x)^{-k-1}$ in the interval $0 \le x < \infty$ is
$k^k(k+1)^{-k-1}$. This approaches zero with 1/k, so that the corollary is proved.
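The extreme value quoted in the last step can be confirmed by a short computation. The sketch below is an illustration under the assumption that the stationary point of $x(1+x)^{-k-1}$ is $x = 1/k$ (obtained by setting the derivative to zero); the helper names `peak_exact` and `peak_on_grid` are hypothetical.

```python
def peak_exact(k: int) -> float:
    """Value k^k / (k+1)^(k+1) of x (1 + x)^{-(k+1)} at x = 1/k."""
    return k ** k / (k + 1) ** (k + 1)


def peak_on_grid(k: int, n: int = 20000) -> float:
    """Largest value of x (1 + x)^{-(k+1)} on a uniform grid over [0, 5/k]."""
    xs = (5.0 * i / (n * k) for i in range(n + 1))
    return max(x * (1.0 + x) ** (-(k + 1)) for x in xs)


# The grid maximum should agree with the closed form and decrease with k.
for k in (1, 5, 50):
    print(k, peak_exact(k), peak_on_grid(k))
```

Since $k^k(k+1)^{-k-1} < 1/k$, the extra term $x(1+x)^{-k-1}$ is uniformly small, which is all the proof requires.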
6. Further inversion formulas. We now prove 


THEOREM 9. If the integral

$$f(x) = \int_0^\infty e^{-xt}\,d\alpha(t)$$
converges for some value of x, then
k 
ko k/t k! 
_ a(t +) + alt —) 


= (t > 0) 


for any constant c. 


This theorem is a generalization of Theorem 2 and reduces to that result 
when c is put equal to zero. To prove it we note as before that for sufficiently 
large positive values of x we have 


$$F(x) = f(x)/x = \int_0^\infty e^{-xt}\alpha(t)\,dt.$$


F(x+c)= f dt, 
0 


we have by an application of Theorem 3 


lim (= (— + (-)" _ +) + at -) 
k! t t 2 


kw 


Set 


k 


(6.1) I= + ayes f + c)du. 


k/t 


That this integral converges for any c and for sufficiently large values of k 
will become apparent when we replace f(x) by xF(x): 


(a(0) = 0) 
Since 
gk 

Tn = f() + (— 1)*# f (uo) + 


By integration by parts we obtain 


k 
+ lim (— 1)**#\(u + +c) 


pckiu 


e 
+(- f (4 + c)du. 
k/t k! 


To show that the integral (6.1) exists it will be sufficient to show that the 
limit involved in the third term of (6.2) exists and that the integral of (6.2) 
converges. But we saw in §3 that 


and that 


lim f(x)at = 


lim (— = (k = 0,1,2,---), 


k 
lim (— (u + + 0) = — 


so that the first and third terms of (6.2) may be omitted. To show that the 
integral (6.2) converges we appeal to the inequality (3.4), from which it 
follows immediately that 


$$|F^{(k)}(x)| < M\,k!/(x - g)^{k+1} \qquad (x > g).$$
By application of this inequality we see easily that the integral in question is
O(1/k) as k becomes infinite. Since the second term of (6.2) approaches
$[\alpha(t+) + \alpha(t-)]/2$, we see that

$$\lim_{k\to\infty} I_k = \frac{\alpha(t+) + \alpha(t-)}{2}.$$

This completes the proof of the theorem.


0 (i= 1,2,---). 
Hence 
or 




7. Relation between the operators L and S. We have already seen that
if
(7.1) $f(x) = \displaystyle\int_0^\infty e^{-xt}\phi(t)\,dt$

then

$$L_t[f(x)] = \phi(t), \qquad S_t[f(x)] = \int_0^t \phi(u)\,du.$$

That is, we are able to determine $\phi(t)$ and its integral in terms of f(x). We
are thus led to seek to determine the successive integrals and the successive
derivatives (provided the latter exist) of $\phi(t)$ in terms of f(x). It will appear
in this section that the successive integrals of $S_{k,t}[f(x)]$ approach the
corresponding integrals of $\phi(t)$ and that the successive derivatives of $L_{k,t}[f(x)]$
approach the corresponding derivatives of $\phi(t)$. We begin by proving

THEOREM 10. If f(x) is any function such that $S_{k,t}[f(x)]$ exists for every
positive t, then


d 
“sealf(e)] = 


almost everywhere in the interval $(0, \infty)$.
By hypothesis f(x) must have derivatives of the first k+1 orders and the
integral


k 
k/t 


must converge for t > 0. But this integral has a derivative with respect to t
almost everywhere in the interval $(0, \infty)$ equal to


kit (+). 


so that the result is proved. 
If f(x) is defined by (7.1) where $\phi(t)$ is integrable in (0, R) for every
positive R, Theorem 2 shows that

$$\lim_{k\to\infty} S_{k,t}[f(x)] = \int_0^t \phi(u)\,du.$$

Theorem 10 shows further that the first derivative of $S_{k,t}[f(x)]$ with respect
to t approaches $\phi(t)$ almost everywhere. For


=— 




and by Theorem 4 
1 


We next establish 


THEOREM 11. If the function $\phi(t)$ is integrable in (0, R) for every positive R,
and if the integral

$$f(x) = \int_0^\infty e^{-xt}\phi(t)\,dt$$

converges for some value of x, then


(- 1) *+1 k m 
= lim f (4) ¢ du. 
koe k!m! k/t 


It is a familiar fact that the iterated integral (7.2) can be expressed as the 


t (t u)™ 


(7.2) 


single integral 


The integral on the right-hand side of (7.2) can be expressed as follows: 


(- 1)*+1 k\™ 
kim! 


k/t 


That the integrals on the right-hand side of this equality converge one sees
at once by use of (3.10) for i = 1, 2, ..., m. For i = 0 the convergence of the
integral was already established in §3. Now let k become infinite in (7.3).
We shall be able to show that


(- 1)*+1 t 
(7.4) lim ————— ki fo) (u)u*-*du -f uip(u)du 
k! k/t 0 
and thus that 
(- 1)*+1 m 
k 


kim! It 


(7.3) 


This will clearly establish the theorem. To prove (7.4) we note first that 


k/t 


(k’+i)/t 
(k’+i)/t 


-f — f (4) du, 
k'/t k 


where k’ =k—i. But 
lim 
(k’ + 1)! k'/t 
as one sees by applying Theorem 2 to the function 


t 
+40 (4) du -f u‘d(u)du, 


0 


f(a) = (= nat 
0 
and by noting that 
1 (kh +i+ 1) 
k’! (k’ + i)! 
That $f(\infty) = 0$ follows from (3.10).
It remains only to show that 


lim I, = 0, 


ko 


where 


(y)du = 0. 


(k+i+ 
(k + i)! k 


/t 
Set

$$\alpha(t) = \int_0^t \phi(y)\,dy.$$
Then

$$F(x) = f(x)/x = \int_0^\infty e^{-xt}\alpha(t)\,dt,$$

and $\alpha(t)$ satisfies the inequality (3.4). Introducing the function F(x) we have
(etre 
I, = (4) + i+ 1)u*F(*+*) (4) |du. 
(k + i)! k/t 


By integration by parts this becomes 


(k + i)! t t 


k k\ (k+i)/t 
— (-) (-) + if |. 
t t k/t 


(k+i)/t 


If we apply Theorem 3 to the function

$$(-1)^i F^{(i)}(x) = \int_0^\infty e^{-xt}t^i\alpha(t)\,dt$$

and to the function F(x) itself, we see that the sum of the first two terms on
the right-hand side of (7.5) approaches zero with 1/k. Finally by virtue of
(2.3) we have

(7.6) $|F^{(k+i)}(u)| < M(k + i)!/(u - g)^{k+i+1}$ $(u > g)$,

whence the third term of (7.5) becomes O(1/k) as k becomes infinite. The
proof of the theorem is thus complete.

We turn next to the problem of determining the successive derivatives
of $\phi(t)$ in terms of f(x). We first establish the following

LEMMA. If the function f(x) is of class* $C^n$ in the interval $0 \le x < \infty$, then

(7.7) $L_{k,t}\left[\dfrac{d^n}{dx^n}\,x^n f(x)\right] = (-1)^n\,t^n\,\dfrac{d^n}{dt^n}L_{k,t}[f(x)]$ $(t > 0;\ n, k = 0, 1, 2, \cdots)$.


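We prove this result by induction. For n = 1 we have
$$\frac{d}{dx}[x f(x)] = x f'(x) + f(x),$$

$$L_{k,t}\left[\frac{d}{dx}\,x f(x)\right] = \frac{(-1)^k}{k!}\left(\frac{k}{t}\right)^{k+1}\left[\frac{k}{t}\,f^{(k+1)}\left(\frac{k}{t}\right) + (k + 1)\,f^{(k)}\left(\frac{k}{t}\right)\right].$$
On the other hand

$$t\,\frac{d}{dt}L_{k,t}[f(x)] = -\frac{(-1)^k}{k!}\left(\frac{k}{t}\right)^{k+1}\left[\frac{k}{t}\,f^{(k+1)}\left(\frac{k}{t}\right) + (k + 1)\,f^{(k)}\left(\frac{k}{t}\right)\right],$$

so that (7.7) is established for n = 1. Suppose it is true for 0, 1, ..., n-1.
Rewrite the left-hand member of (7.7) as follows:

(7.8) $L_{k,t}\left\{\dfrac{d^{n-1}}{dx^{n-1}}\,x^{n-1}\left[(n-1)f(x) + \dfrac{d}{dx}[x f(x)]\right]\right\}.$

If we replace n by n-1 and f(x) by
$$(n-1)f(x) + \frac{d}{dx}[x f(x)]$$

in (7.7) we see that (7.8) becomes

$$(-1)^{n-1}\,t^{n-1}\,\frac{d^{n-1}}{dt^{n-1}}\left\{(n-1)L_{k,t}[f(x)] + L_{k,t}\left[\frac{d}{dx}\,x f(x)\right]\right\}.$$

Applying (7.7) again, now with n = 1, (7.8) becomes

$$(-1)^{n-1}\,t^{n-1}\,\frac{d^{n-1}}{dt^{n-1}}\left\{(n-1)L_{k,t}[f(x)] - t\,\frac{d}{dt}L_{k,t}[f(x)]\right\} = (-1)^n\,t^n\,\frac{d^n}{dt^n}L_{k,t}[f(x)].$$

The induction is thus complete.

* That is, continuous with its first n derivatives.

By use of this Lemma we establish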


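THEOREM 12. If the function $\phi(t)$ is of class $C^n$ in the interval $0 \le t < \infty$, and
if the integral

(7.9) $\displaystyle\int_0^\infty e^{-xt}\phi^{(n)}(t)\,dt$

converges for some value of x, then the integral
(7.10) $f(x) = \displaystyle\int_0^\infty e^{-xt}\phi(t)\,dt$
also converges for large values of x and

$$\phi^{(n)}(t) = \lim_{k\to\infty}\frac{d^n}{dt^n}L_{k,t}[f(x)] \qquad (t > 0).$$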
dl 


Since the integral (7.9) converges, an application of inequality (3.4) gives
(7.11) $|\phi^{(n-1)}(t)| < M e^{gt}$ $(0 \le t < \infty)$.
Integrating (7.9) by parts and using (7.11) we obtain

$$\int_0^\infty e^{-xt}\phi^{(n)}(t)\,dt = -\phi^{(n-1)}(0) + x\int_0^\infty e^{-xt}\phi^{(n-1)}(t)\,dt$$

for all values of x sufficiently large. Successive applications of this result will
show that (7.10) converges and that

$$x^n f(x) = \phi^{(n-1)}(0) + x\,\phi^{(n-2)}(0) + \cdots + x^{n-1}\phi(0) + \int_0^\infty e^{-xt}\phi^{(n)}(t)\,dt.$$
Differentiate both sides of this equation n times with respect to x,

$$\frac{d^n}{dx^n}[x^n f(x)] = (-1)^n\int_0^\infty e^{-xt}\,t^n\phi^{(n)}(t)\,dt,$$

and apply the operator $L_{k,t}$ to both sides of the resulting equation. This
gives, when the result of the Lemma is taken into account,

$$t^n\,\frac{d^n}{dt^n}L_{k,t}[f(x)] = L_{k,t}\left[\int_0^\infty e^{-xu}u^n\phi^{(n)}(u)\,du\right].$$

Take the limit of both sides of this equation as k becomes infinite, applying
Theorem 4, and obtain

$$\lim_{k\to\infty} t^n\,\frac{d^n}{dt^n}L_{k,t}[f(x)] = t^n\phi^{(n)}(t),$$

from which the result of the theorem follows immediately.
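The conclusion of Theorem 12 can be checked in a case where everything is explicit. The sketch below is illustrative only: it assumes the operator form $L_{k,t}[f] = \frac{(-1)^k}{k!}f^{(k)}(k/t)(k/t)^{k+1}$ and takes $\phi(t) = e^{-t}$, $f(x) = 1/(1+x)$, so that $L_{k,t}[f] = (1 + t/k)^{-(k+1)}$ and its t-derivatives have a closed form. The name `nth_derivative_of_L` is hypothetical.

```python
import math


def nth_derivative_of_L(k: int, t: float, n: int) -> float:
    """d^n/dt^n of (1 + t/k)^{-(k+1)}, i.e. of L_{k,t}[f] when phi(t) = e^{-t}.

    Each differentiation brings down a factor -(k+j)/k, j = 1, ..., n.
    """
    rising = math.prod(k + j for j in range(1, n + 1))  # (k+1)(k+2)...(k+n)
    return (-1) ** n * rising / k ** n * (1.0 + t / k) ** (-(k + 1 + n))


# The n-th derivative of e^{-t} is (-1)^n e^{-t}; compare at t = 0.7.
t = 0.7
for n in (1, 2, 3):
    print(n, nth_derivative_of_L(10**5, t, n), (-1) ** n * math.exp(-t))
```

The agreement improves like 1/k, in line with the rate observed for the undifferentiated operator in this example.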

8. The operator L applied to the Laplace-Stieltjes integral. We conclude
Part I with a discussion of the effect of the operator L on functions f(x) which
are defined by Laplace-Stieltjes integrals (1.1). We have already seen that

$$S_t[f(x)] = \frac{\alpha(t+) + \alpha(t-)}{2}.$$

We now show that $L_t[f(x)]$ also exists for certain values of t. We prove

THEOREM 13. Let the function $\alpha(t)$ be of bounded variation in the interval
(0, R) for every positive R, and let it possess a derivative on the right $\alpha_+'(t_0)$ and
a derivative on the left $\alpha_-'(t_0)$ at a point $t_0 > 0$. Then if the integral

$$f(x) = \int_0^\infty e^{-xt}\,d\alpha(t)$$

converges for some value of x,

$$L_{t_0}[f(x)] = \frac{\alpha_+'(t_0) + \alpha_-'(t_0)}{2}.$$


We note first that it will be no essential restriction to suppose that $t_0 = 1$.
For, set

$$g(x) = f\left(\frac{x}{t_0}\right) = \int_0^\infty e^{-xt/t_0}\,d\alpha(t) = \int_0^\infty e^{-xu}\,d\alpha(t_0 u)$$

where $t = t_0 u$. Simple computation shows that
$$L_1[g(x)] = t_0\,L_{t_0}[f(x)]$$

and that the derivatives on the right and left of $\alpha(t_0 u)$ at u = 1 are $t_0\,\alpha_+'(t_0)$
and $t_0\,\alpha_-'(t_0)$ respectively. Hence if we have proved the theorem for $t_0 = 1$ we
have

$$L_{t_0}[f(x)] = \frac{L_1[g(x)]}{t_0} = \frac{\alpha_+'(t_0) + \alpha_-'(t_0)}{2}.$$
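Form the expression

$$L_{k,1}[f(x)] = \frac{k^{k+1}}{k!}\int_0^\infty e^{-kt}t^k\,d\alpha(t).$$

The improper integral will converge if k is sufficiently large. It will be
sufficient to show that the two integrals

$$I_1 = \frac{k^{k+1}}{k!}\int_0^1 e^{-kt}t^k\,d[\alpha(t) - \alpha_-'(1)\,t],$$

$$I_2 = \frac{k^{k+1}}{k!}\int_1^\infty e^{-kt}t^k\,d[\alpha(t) - \alpha_+'(1)\,t]$$

approach zero with 1/k, since we saw in §2 that

$$\lim_{k\to\infty}\frac{k^{k+1}}{k!}\int_0^1 t^k e^{-kt}\alpha_-'(1)\,dt = \frac{\alpha_-'(1)}{2}, \qquad \lim_{k\to\infty}\frac{k^{k+1}}{k!}\int_1^\infty t^k e^{-kt}\alpha_+'(1)\,dt = \frac{\alpha_+'(1)}{2}.$$

If we introduce the functions
$$\beta(t) = \alpha(1) - \alpha(t) - \alpha_-'(1)(1 - t),$$
$$\gamma(t) = \alpha(t) - \alpha(1) - \alpha_+'(1)(t - 1),$$

we have

$$I_1 = -\frac{k^{k+1}}{k!}\int_0^1 e^{-kt}t^k\,d\beta(t),$$

$$I_2 = \frac{k^{k+1}}{k!}\int_1^\infty e^{-kt}t^k\,d\gamma(t).$$

By noting that
$$\beta(t) = o(1 - t) \qquad (t \to 1-),$$
$$\gamma(t) = o(t - 1) \qquad (t \to 1+),$$

and by use of the methods of §4, we see that $I_1$ and $I_2$ approach zero with
1/k. We omit the details.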

The proof may easily be extended to include the case in which $\alpha(t)$ has
right-hand or left-hand derivatives that are infinite at $t_0$.




Part II 


THE REPRESENTATION OF FUNCTIONS AS LAPLACE INTEGRALS 


9. A preliminary formula. In Part I we dealt with functions f(x) which
were known to be expressible as Laplace integrals. Here we shall abandon
this assumption and assume only that the functions f(x) are such as to make
the operators L and S have meaning. This will lead us to certain uniqueness
theorems regarding the representation of functions as Laplace integrals and
to necessary and sufficient conditions for such representation. We begin with

THEOREM 14. If f(x) is of class $C^n$ in the interval $c \le x < \infty$, and if the
integrals

(9.1) $\displaystyle\int_c^\infty u^k f^{(k+1)}(u)\,du$ $(k = 0, 1, 2, \cdots, n-1)$

converge, then

(9.2) $\dfrac{(-1)^k}{k!}F^{(k)}(x)\,x^{k+1} = f(\infty) + \dfrac{(-1)^{k+1}}{k!}\displaystyle\int_x^\infty u^k f^{(k+1)}(u)\,du$

where F(x) = f(x)/x.
By integration by parts we obtain 


yk u* 


a» k 
+ ( (u)du. 


Since both integrals converge by hypothesis it follows that the limits
$$\lim_{x\to\infty} x^k f^{(k)}(x)$$

exist for k = 0, 1, ..., n-1. The existence of all of these limits implies that
they are zero except perhaps for k = 0. We prove this by induction. Suppose
that

$$\lim_{x\to\infty} x f'(x) = B > 0.$$

Then there exists a positive number $x_0$ such that

$$x f'(x) > B/2 \qquad (x \ge x_0 > 0).$$
Hence

$$\int_{x_0}^{x} f'(u)\,du > \frac{B}{2}\int_{x_0}^{x}\frac{du}{u},$$

$$f(x) > \frac{B}{2}\log\frac{x}{x_0} + f(x_0).$$

As x becomes positively infinite the right-hand side of this inequality does
also, so that f(x) can not approach a limit as the hypothesis demands. If B
is negative we need only replace f(x) by -f(x) in the foregoing proof.
Now suppose we have proved that
$$\lim_{x\to\infty} x^k f^{(k)}(x) = 0.$$


Then 
x 


approaches a limit by hypothesis. By the previous work this limit must be 
zero, whence 


lim (x) = 


This completes the induction and gives us the equation 


(- aye f pw 


(x)x* 


Successive application of this result gives (9.2). 
COROLLARY 1. Under the conditions of the theorem,
$$L_{k,t}[F(x)] = S_{k,t}[f(x)] \qquad (0 < t \le k/c;\ k = 0, 1, \cdots, n-1)$$
if c > 0.
COROLLARY 2. If condition (9.1) is replaced by the condition
(9.3) $\lim_{x\to\infty} x^k f^{(k)}(x)$ exists $(k = 0, 1, \cdots, n-1)$,

the result of the theorem remains true.

To prove this we have only to show that (9.3) implies (9.1).

10. Uniqueness theorems. We turn next to the proof of

THEOREM 15. If the function F(x) is of class $C^\infty$ in the interval $0 < x < \infty$
and satisfies the inequalities

(10.1) $|F^{(k)}(x)| < \dfrac{M\,k!}{x^{k+1}}$ $(x > 0;\ k = 0, 1, \cdots)$,

then

(10.2) $\lim_{k\to\infty}\displaystyle\int_0^\infty e^{-xt}L_{k,t}[F(x)]\,dt = F(x)$ $(x > 0)$.


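That the integral (10.2) exists for k = 0, 1, 2, ... follows at once from the
inequalities

$$|L_{k,t}[F(x)]| < M \qquad (t > 0;\ k = 0, 1, 2, \cdots),$$
$$\left|\int_0^\infty e^{-xt}L_{k,t}[F(x)]\,dt\right| < M\int_0^\infty e^{-xt}\,dt = \frac{M}{x} \qquad (x > 0).$$

If we set
$$H_k(x) = \int_0^{k/x} e^{-xt}L_{k,t}[F(x)]\,dt,$$

$$I_k(x) = \int_{k/x}^\infty e^{-xt}L_{k,t}[F(x)]\,dt,$$

we have

$$\int_0^\infty e^{-xt}L_{k,t}[F(x)]\,dt = H_k(x) + I_k(x),$$

and it will be sufficient to show that
$$\lim_{k\to\infty} H_k(x) = F(x), \qquad \lim_{k\to\infty} I_k(x) = 0.$$
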
By the change of variable u=k/t we obtain 
Hi(x) = (— 1)* f q wae. 
On the other hand we have 
(= 
2 (k — 1)! 
(— 1)*(% — 


— y)k-2 


The first term on the right-hand side of this equation is zero since 


lim F-)(u)u*-! = 0, 


as we see from (10.1). By successive integration by parts we see that 


141 
we have 

FORD (u)du 4 
)! 
i 


D. V. WIDDER 


Hence 


Again using (10.1), 


4 
| Hi(x) — F(x) | < ue f - (1 - 
z u? 


or, by the change of variable v=x/u, 
| Hi(x) — F(x)| < — (1 — do. 
We write 
et — (1 — = — (1 — — — 

and note that

$$e^{-kv} - (1 - v)^k > 0 \qquad (0 < v \le 1),$$
as one sees by virtue of the familiar inequality

$$1 + x < e^x \qquad (x \ne 0).$$


Hence 


| He(x) — F(x)| < [e“** — (1 — 0)*]dv + o(1 — 
x 0 5 


1 Mk 
k k+1 I(k +2) 


= 

k+1 k+1 
From this inequality it follows that 


lim H;,(x) = F(x) 


for all positive values of x. 
For $I_k(x)$ we clearly have the inequality

$$|I_k(x)| < M\int_{k/x}^\infty e^{-xt}\,dt = M e^{-k}/x \qquad (x > 0).$$

This shows that
$$\lim_{k\to\infty} I_k(x) = 0$$

for all positive values of x, and the proof of the theorem is complete.
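Theorem 15 can be illustrated by direct quadrature in one explicit case. The sketch below is an illustration, not the author's method: it assumes the operator form $L_{k,t}[F] = \frac{(-1)^k}{k!}F^{(k)}(k/t)(k/t)^{k+1}$, takes $F(x) = 1/(1+x)$ (which satisfies (10.1) with M = 1) so that $L_{k,t}[F] = (1 + t/k)^{-(k+1)}$, and approximates the integral in (10.2) by Simpson's rule on a truncated range. The truncation point, the number of panels and the name `laplace_of_L` are choices made here.

```python
import math


def laplace_of_L(k: int, x: float, upper: float = 60.0, n: int = 20000) -> float:
    """Simpson approximation of integral_0^upper e^{-xt} (1 + t/k)^{-(k+1)} dt.

    n must be even; the tail beyond `upper` is neglected, which is harmless
    for x of order one because the integrand is below e^{-x * upper} there.
    """
    h = upper / n

    def g(t: float) -> float:
        return math.exp(-x * t) * (1.0 + t / k) ** (-(k + 1))

    total = g(0.0) + g(upper)
    total += 4.0 * sum(g((2 * i - 1) * h) for i in range(1, n // 2 + 1))
    total += 2.0 * sum(g(2 * i * h) for i in range(1, n // 2))
    return total * h / 3.0


# The values should approach F(1) = 1/2 as k grows.
for k in (1, 10, 100, 1000):
    print(k, laplace_of_L(k, 1.0))
```

For this F the limit function is $e^{-t}$, whose Laplace transform is again $1/(1+x)$, so the example also shows how the theorem recovers F from its own operator sequence.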
We now prove a companion theorem for the operator S. 
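THEOREM 16. If the function f(x) is of class $C^\infty$ in the interval $0 < x < \infty$,
and if

(10.3) $\left|\displaystyle\int_x^\infty\frac{u^k}{k!}f^{(k+1)}(u)\,du\right| < M$ $(x > 0;\ k = 0, 1, \cdots)$,

then
$$\lim_{k\to\infty}\int_0^\infty e^{-xt}\,dS_{k,t}[f(x)] = f(x) - f(\infty) \qquad (x > 0)$$

where $S_{k,0}[f(x)]$ is defined as $f(\infty)$.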


The existence of the integral (10.3) for all k enables us to employ Theorem
14 for an arbitrary value of k. Thus

$$\frac{(-1)^k}{k!}F^{(k)}(x)\,x^{k+1} = f(\infty) + (-1)^{k+1}\int_x^\infty\frac{u^k}{k!}f^{(k+1)}(u)\,du \qquad (k = 0, 1, \cdots),$$

where F(x) = f(x)/x. By (10.3) we have
(10.4) $\left|\dfrac{(-1)^k}{k!}F^{(k)}(x)\,x^{k+1}\right| \le |f(\infty)| + M = N$ $(x > 0;\ k = 0, 1, \cdots)$.

Hence we may apply Theorem 15 to F(x) and obtain

$$\lim_{k\to\infty}\int_0^\infty e^{-xt}L_{k,t}[F(x)]\,dt = F(x).$$

By Corollary 1 of Theorem 14,
(10.5) $L_{k,t}[F(x)] = S_{k,t}[f(x)]$ $(t > 0;\ k = 0, 1, 2, \cdots)$.
The explicit expression for $S_{k,t}[f(x)]$ shows that
(10.6) $\lim_{t\to 0} S_{k,t}[f(x)] = f(\infty)$.

An integration by parts, using (10.4), (10.5), and (10.6), gives

(10.7) $\displaystyle\int_0^\infty e^{-xt}L_{k,t}[F(x)]\,dt = \frac{f(\infty)}{x} + \frac{1}{x}\int_0^\infty e^{-xt}\,dS_{k,t}[f(x)]$ $(x > 0)$.

It is to be noted that we have not proved that $S_{k,t}[f(x)]$ is of bounded
variation in $0 \le t \le R$. This is not necessary for the existence of the above Stieltjes
integral. We have only to note that $e^{-xt}$ is a function of bounded variation in
the interval $0 \le t \le R$ for every positive x and R, and that $S_{k,t}[f(x)]$ is
continuous in $0 \le t \le R$ (if defined as $f(\infty)$ at t = 0). Allowing k to become infinite
in (10.7) we have

$$\frac{f(x)}{x} = \frac{f(\infty)}{x} + \frac{1}{x}\lim_{k\to\infty}\int_0^\infty e^{-xt}\,dS_{k,t}[f(x)],$$

from which the result of the theorem follows immediately.

Theorems 15 and 16 may be regarded as uniqueness theorems. The
former shows at once that if $F_1(x)$ and $F_2(x)$ are two functions satisfying the
inequalities (10.1) and such that
(10.8) $L_t[F_1(x)] = L_t[F_2(x)]$,
then $F_1(x) = F_2(x)$ for all positive values of x. For, the function

$$\Phi(x) = F_1(x) - F_2(x)$$
also satisfies inequalities (10.1). By Theorem 15
(10.9) $\lim_{k\to\infty}\displaystyle\int_0^\infty e^{-xt}L_{k,t}[\Phi(x)]\,dt = \Phi(x)$.

But by (10.8)
$$\lim_{k\to\infty} L_{k,t}[\Phi(x)] = 0.$$

Since
$$|e^{-xt}L_{k,t}[\Phi(x)]| < e^{-xt}M,$$

and since the function $Me^{-xt}$ is integrable with respect to t on the infinite
interval $(0, \infty)$ for every positive x, we may take the limit under the
integral sign in (10.7) and obtain
$$\Phi(x) = 0.$$
In a similar way we can show that if $\Phi(x)$ is a function satisfying (10.3)
and such that
$$S_t[\Phi(x)] = 0 \qquad (t > 0),$$

then $\Phi(x)$ is identically zero for all positive values of x. For, by Theorem 16,

$$\lim_{k\to\infty}\int_0^\infty e^{-xt}\,dS_{k,t}[\Phi(x)] = \Phi(x) - \Phi(\infty).$$

But

$$\int_0^\infty e^{-xt}\,dS_{k,t}[\Phi(x)] = -\Phi(\infty) + x\int_0^\infty e^{-xt}S_{k,t}[\Phi(x)]\,dt,$$

and since
$$|S_{k,t}[\Phi(x)]| < N,$$
we have, on allowing k to become infinite,
$$\Phi(x) - \Phi(\infty) = -\Phi(\infty).$$
The result is thus established.
11. Bernstein's theorem. By use of Theorem 16 we can now give a much
simplified proof of a theorem of S. Bernstein.* As a preliminary result we
prove

THEOREM 17. If the function f(x) is completely monotonic in the interval
$-\eta < x < \infty$ $(\eta > 0)$, then

$$f(x) = \int_0^\infty e^{-xt}\,d\alpha(t)$$

where the function $\alpha(t)$ is a non-decreasing function and the integral converges
for x > 0.

We recall that a function is completely monotonic in an interval
$-\eta < x < \infty$ if it possesses derivatives of all orders there which satisfy the
inequalities

$$(-1)^k f^{(k)}(x) \ge 0 \qquad (-\eta < x < \infty;\ k = 0, 1, 2, \cdots).$$
We prove first that the limits
(11.1) $\lim_{x\to\infty} x^k f^{(k)}(x)$

exist. The result is obvious for k = 0 since a completely monotonic function is
non-negative and non-increasing. Now form the function

$$f(x) - x f'(x).$$

It is clearly non-negative for x > 0 and has a non-positive derivative

$$\frac{d}{dx}\,(f(x) - x f'(x)) = -x f''(x).$$

* S. Bernstein, Sur les fonctions absolument monotones, Acta Mathematica, vol. 52 (1929), p. 1.
See also F. Hausdorff, Summationsmethoden und Momentfolgen, Mathematische Zeitschrift, vol. 9
(1921), pp. 280-299.

It must therefore approach a limit as x becomes infinite, and hence the function
$x f'(x)$ does also. We now proceed by induction. Suppose that the limit
(11.1) exists for k = 0, 1, ..., n. We can prove that it also exists for k = n+1
by considering the function


(= 


re 


(11.2) f(x) — af"(x) + 


whose derivative is 
(- 1) x) /n!. 


The function (11.2) being non-negative non-increasing approaches a 
limit as x becomes infinite. All terms except the last approach a limit by as- 
sumption, so that this last must also. This completes the induction. 

By Corollary 2 of Theorem 14 we see that the integrals

$$\int_x^\infty u^k f^{(k+1)}(u)\,du \qquad (k = 0, 1, 2, \cdots)$$

converge for positive x. Hence the function

(11.3) $S_{k,t}[f(x)] = f(\infty) + (-1)^{k+1}\displaystyle\int_{k/t}^\infty\frac{u^k}{k!}f^{(k+1)}(u)\,du$

is well defined for all positive values of t and for all positive integers k. We
define the function as $f(\infty)$ for t = 0. Since f(x) is completely monotonic for
x > 0, it is clear that the integrand of the integral (11.3) is non-negative.
Hence $S_{k,t}[f(x)]$ is a non-negative non-decreasing function of t. Since the
function $f^{(k+1)}(x)$ is continuous in the neighborhood of the origin, we have

$$S_{k,t}[f(x)] \le f(\infty) + (-1)^{k+1}\int_0^\infty\frac{u^k}{k!}f^{(k+1)}(u)\,du \qquad (0 \le t < \infty).$$
But this integral is independent of k. In fact
$$(-1)^{k+1}\int_0^\infty\frac{u^k}{k!}f^{(k+1)}(u)\,du = f(0) - f(\infty),$$

as one sees by successive integrations by parts. Hence
$$0 \le S_{k,t}[f(x)] \le f(0) \qquad (t \ge 0;\ k = 0, 1, 2, \cdots).$$
We are thus in a position to apply Theorem 16, for the boundedness of
$S_{k,t}[f(x)]$ implies the condition (10.3). Hence

$$f(x) = f(\infty) + \lim_{k\to\infty}\int_0^\infty e^{-xt}\,dS_{k,t}[f(x)].$$

Now by a theorem of E. Helly* it is possible to pick from the bounded
sequence of functions $S_{k,t}[f(x)]$ (k = 0, 1, 2, ...) a sub-sequence $S_{k_j,t}[f(x)]$
which approaches a non-decreasing function $\beta(t)$ as j becomes infinite. By
the Helly-Bray theorem† it is permissible to take the limit under the integral
sign so that we have

$$f(x) = f(\infty) + \lim_{j\to\infty}\int_0^\infty e^{-xt}\,dS_{k_j,t}[f(x)] = f(\infty) + \int_0^\infty e^{-xt}\,d\beta(t).$$
Clearly
$$\beta(0) = f(\infty) \ge 0.$$
Hence if we define
$$\alpha(t) = \beta(t) \qquad (t > 0),$$
$$\alpha(0) = 0,$$

it is evident that $\alpha(t)$ remains a non-decreasing function and that
$$f(x) = \int_0^\infty e^{-xt}\,d\alpha(t) \qquad (x > 0).$$

The theorem is completely established. We now prove the theorem of
Bernstein:


THEOREM 18. A necessary and sufficient condition that f(x) should be
completely monotonic for x > c is that

$$f(x) = \int_0^\infty e^{-xt}\,d\alpha(t)$$

where the function $\alpha(t)$ is non-decreasing and the integral converges for x > c.

The sufficiency of the condition is established simply by noting that the
function

$$(-1)^k f^{(k)}(x) = \int_0^\infty e^{-xt}t^k\,d\alpha(t)$$

must be non-negative for those values of x which make the integral converge
if $\alpha(t)$ is non-decreasing.

* Helly, Über lineare Funktionaloperationen, Wiener Sitzungsberichte, vol. 121 (1912), p. 265.
† See, for example, G. C. Evans, The Logarithmic Potential, Discontinuous Dirichlet and Neumann
Problems, Colloquium Publications, vol. 6, of the American Mathematical Society, 1927, p. 15.

The necessity of the condition is easily established by use of Theorem 17.
The important distinction between our present hypothesis and that of
Theorem 17 is that here we do not know the function f(x) to be completely
monotonic outside the interval in which the Laplace integral is to converge.
We note first that the function $f(x + c + \eta)$ is completely monotonic for
$x > -\eta$. Applying Theorem 17 we have

$$f(x + c + \eta) = \int_0^\infty e^{-xt}\,d\beta_\eta(t),$$

where $\beta_\eta(t)$ is non-decreasing and the integral converges for x > 0. That is,

(11.4) $f(x) = \displaystyle\int_0^\infty e^{-xt}e^{(c+\eta)t}\,d\beta_\eta(t) = \int_0^\infty e^{-xt}\,d\alpha_\eta(t),$

where

$$\alpha_\eta(t) = \int_0^t e^{(c+\eta)u}\,d\beta_\eta(u).$$

The integral (11.4) converges for $x > c + \eta$. This argument holds for each
positive value of $\eta$ and appears to give various integral expressions for f(x)
corresponding to the various values of $\eta$ used. But since a function can have but
a single Laplace integral representation,* we see that $\alpha_\eta(t)$ is independent of
$\eta$ and may be denoted by $\alpha(t)$. This is a non-decreasing function. Hence

$$f(x) = \int_0^\infty e^{-xt}\,d\alpha(t),$$

where the integral converges for $x > c + \eta$. Since $\eta$ was arbitrary, the integral
converges for x > c and the proof is complete.
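The construction used in Theorem 17 can be followed in a concrete case. The sketch below is illustrative and rests on assumptions made here: it takes the completely monotonic function $f(x) = 1/(1+x)$, for which $f(\infty) = 0$, uses the definition (11.3) of $S_{k,t}$ as read above, and evaluates the integral in closed form by the substitution $w = u/(1+u)$, which gives $S_{k,t}[f] = 1 - (k/(k+t))^{k+1}$. The expected limit is $\alpha(t) = 1 - e^{-t}$. The function name `S` is hypothetical.

```python
import math


def S(k: int, t: float) -> float:
    """S_{k,t}[f] for f(x) = 1/(1 + x), in closed form.

    The integrand (u^k/k!) * (k+1)!/(1+u)^{k+2} becomes (k+1) w^k dw under
    w = u/(1+u), so the integral from k/t to infinity is 1 - w0^{k+1}
    with w0 = k/(k+t).  At t = 0 the value is defined as f(infinity) = 0.
    """
    if t == 0:
        return 0.0
    w0 = k / (k + t)
    return 1.0 - w0 ** (k + 1)


# S_{k,t} should stay between 0 and f(0) = 1, increase with t, and tend to 1 - e^{-t}.
for k in (5, 50, 5000):
    print(k, [round(S(k, t), 6) for t in (0.0, 0.5, 1.0, 3.0)])
print("limit", [round(1.0 - math.exp(-t), 6) for t in (0.0, 0.5, 1.0, 3.0)])
```

In this example the whole sequence converges, so no passage to a sub-sequence is needed; the Helly selection step in the proof handles the general case where that is not known in advance.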

12. Representation by absolutely convergent Laplace-Stieltjes integrals.
We are now able to give by the present methods a much simplified proof of
a theorem of the author.† We state it first in the slightly less general form:

THEOREM 19. A necessary and sufficient condition that the function f(x) can
be expressed as

(12.1) $f(x) = \displaystyle\int_0^\infty e^{-xt}\,d\alpha(t)$

where $\alpha(t)$ is of bounded variation in the infinite interval $0 \le x < \infty$ is that f(x)
should be of class $C^\infty$ in that interval and that

(12.2) $\displaystyle\int_x^\infty\frac{u^k}{k!}\,|f^{(k+1)}(u)|\,du \le M$ $(x > 0;\ k = 0, 1, 2, \cdots)$.

* D. V. Widder, loc. cit., p. 705. We are assuming of course that the functions $\alpha_\eta(t)$, $\beta_\eta(t)$ etc.
are all normalized. That is,
$\alpha_\eta(0) = 0$, $[\alpha_\eta(t+) + \alpha_\eta(t-)]/2 = \alpha_\eta(t)$ $(t > 0)$.
† D. V. Widder, Necessary and sufficient conditions for the representation of a function as a
Laplace integral, these Transactions, vol. 33 (1931), p. 851.


Our present methods enable us to improve the proof of both the necessity
and the sufficiency of the condition. We begin with the necessity. Suppose
the function f(x) to have the representation (12.1), the total variation of
$\alpha(t)$ in the interval $(0, \infty)$ being equal to M. Then the integral (12.1) must
converge absolutely for $x \ge 0$. Denote the total variation of $\alpha(u)$ in the
interval $0 \le u \le t$ by V(t). Then $0 \le V(t) \le M$ for $0 \le t < \infty$. Set

$$g(x) = \int_0^\infty e^{-xt}\,dV(t).$$

Then

$$|f^{(k)}(x)| = \left|\int_0^\infty e^{-xt}t^k\,d\alpha(t)\right| \le \int_0^\infty e^{-xt}t^k\,dV(t) = (-1)^k g^{(k)}(x),$$

and

(12.3) $\displaystyle\int_x^\infty\frac{u^k}{k!}\,|f^{(k+1)}(u)|\,du \le (-1)^{k+1}\int_x^\infty\frac{u^k}{k!}\,g^{(k+1)}(u)\,du$ $(x > 0;\ k = 0, 1, 2, \cdots)$.

Since V(t) is a non-decreasing function, g(x) is completely monotonic for
x > 0. Hence the integrals (12.3) surely converge for x > 0 as the argument
used in the proof of Theorem 17 shows. Integrating the right-hand member
of (12.3) by parts, we obtain


© yk x? xk 
k! 2! k! 


~kgk 


x*t 


Again apply integration by parts to this last integral: 


yk 
f | | du f e~**V(t) dt (x > 0). 
0 


Finally, since $V(t) \le M$ we have

$$\int_x^\infty\frac{u^k}{k!}\,|f^{(k+1)}(u)|\,du \le M \qquad (x > 0).$$

This completes the proof of the necessity of the condition.
We turn next to the proof of the sufficiency. The condition (12.2) on
f(x) enables us to show that $S_{k,t}[f(x)]$ is of bounded variation in the infinite
interval $(0, \infty)$ and that the total variation of $S_{k,t}[f(x)]$ in that interval has an
upper bound independent of k. For, let R be an arbitrary positive constant.
Divide the interval (0, R) into sub-intervals by points $t_i$ such that

$$0 = t_0 < t_1 < \cdots < t_n = R.$$

Then we have

$$\sum_{i=0}^{n-1}\left|S_{k,t_{i+1}}[f(x)] - S_{k,t_i}[f(x)]\right| \le \sum_{i=0}^{n-1}\int_{k/t_{i+1}}^{k/t_i}\frac{u^k}{k!}\,|f^{(k+1)}(u)|\,du = \int_{k/R}^\infty\frac{u^k}{k!}\,|f^{(k+1)}(u)|\,du \le M.$$
Since M is independent of the manner of sub-division of the interval (0, R)
the function $S_{k,t}[f(x)]$ is of bounded variation in the interval and its total
variation in that interval is at most M, a number independent of k and of R.
Hence the total variation of $S_{k,t}[f(x)]$ in $(0, \infty)$ is also at most equal to M
for all positive integers k.

Now by the theorem of Helly already employed in the proof of Theorem
17 we can pick from the sequence $S_{k,t}[f(x)]$ a sub-sequence $S_{k_j,t}[f(x)]$ which
approaches a limit $\alpha(t)$ defined for $0 \le t < \infty$, whose total variation in that
interval is at most M.

The condition (12.2) clearly implies the condition (10.3), so that we may
apply Theorem 16 here to show that

(12.4) $f(x) - f(\infty) = \lim_{k\to\infty}\displaystyle\int_0^\infty e^{-xt}\,dS_{k,t}[f(x)].$

Since
$$|S_{k,t}[f(x)]| \le |f(\infty)| + M$$
we may use precisely the same argument as that used in the proof of Theorem
17 to show that we may take the limit under the sign of integration in (12.4).
Thus
$$f(x) = f(\infty) + \int_0^\infty e^{-xt}\,d\alpha(t) \qquad (\alpha(0) = f(\infty)),$$

or if $\alpha(0)$ is defined as zero,

$$f(x) = \int_0^\infty e^{-xt}\,d\alpha(t).$$

In either case $\alpha(t)$ is of bounded variation in $(0, \infty)$ and the proof is complete.
We now prove the more general result by use of Theorem 19.
We now prove the more general result by use of Theorem 19. 


THEOREM 20. A necessary and sufficient condition that f(x) can be expressed
in the form

(12.5) $f(x) = \displaystyle\int_0^\infty e^{-xt}\,d\alpha(t)$

where $\alpha(t)$ is of bounded variation in the interval (0, R) for every positive R, the
integral converging absolutely for x > c, is that f(x) should be of class $C^\infty$ in the
interval $c < x < \infty$ and that for every positive constant $\delta$ there should exist a
constant $M_\delta$ such that

(12.6) $\displaystyle\int_x^\infty\frac{u^k}{k!}\,|f^{(k+1)}(u + c + \delta)|\,du \le M_\delta$ $(x > 0;\ k = 0, 1, 2, \cdots)$.


We prove first the necessity of the condition. Let f(x) have the form
(12.5). Then

(12.7) $f(x + c + \delta) = \displaystyle\int_0^\infty e^{-xt}\,d\beta(t)$

where

$$\beta(t) = \int_0^t e^{-(c+\delta)u}\,d\alpha(u).$$

The integral (12.7) converges absolutely for $x > -\delta$. The total variation of
$\beta(u)$ in the interval $0 \le u \le t$ is

$$\int_0^t e^{-(c+\delta)u}\,dV(u),$$

where V(t) is the total variation of $\alpha(u)$ in that interval. Hence the total
variation of $\beta(u)$ in the interval $0 \le u < \infty$ is

$$M_\delta = \int_0^\infty e^{-(c+\delta)u}\,dV(u).$$

This integral converges since (12.5) converges absolutely for $x = c + \delta$. Now
applying Theorem 19 to the function $f(x + c + \delta)$ we get (12.6). This
establishes the necessity of the condition.

Now assume that (12.6) holds. By Theorem 19 we have

$$f(x + c + \delta) = \int_0^\infty e^{-xt}\,d\beta(t)$$

where $\beta(t)$ is of bounded variation in the interval $0 \le t < \infty$. This integral
consequently converges absolutely for $x \ge 0$. Hence the integral

$$f(x) = \int_0^\infty e^{-xt}\,d\alpha(t),$$

$$\alpha(t) = \int_0^t e^{(c+\delta)u}\,d\beta(u),$$

converges absolutely for $x \ge c + \delta$. This argument holds for each positive $\delta$.
The function $\alpha(t)$ appears to depend on $\delta$. This is not the case, however, as
one sees by again appealing to the uniqueness theorem for the representation
of a function by a Laplace integral.* Since $\delta$ is arbitrary it follows that (12.5)
converges absolutely for x > c and the proof is complete.

It is natural to inquire what sort of condition is imposed on the function
$f(x)$ by (12.6) if the absolute value signs are removed from the integrand and
are applied instead to the integral itself. In this connection we prove

THEOREM 21. A necessary and sufficient condition that the function $f(x)$
can be expressed in the form

$f(x) = x\int_0^\infty e^{-xt}\varphi(t)\,dt,$

where $\varphi(t)$ is integrable in $(0, R)$ for every positive $R$, is uniformly bounded in
$(0, \infty)$, and is such that the limit

(12.8)    $\lim_{t\to 0} \frac{1}{t}\int_0^t \varphi(u)\,du$

exists, is that a constant $M$ should exist for which

(12.9)    $\left| \int_x^\infty \frac{u^k}{k!} f^{(k+1)}(u)\,du \right| < M \qquad (x>0;\ k=0,1,\cdots).$


* The function $\beta(u)$ of course depends on $\delta$. It is precisely this fact that makes it possible for
$\alpha(t)$ to be independent of $\delta$.

We first establish the necessity of the condition. Set $F(x) = f(x)/x$. If
$|\varphi(t)| < N$ for $0<t<\infty$, then

(12.10)    $\left| \frac{x^{k+1}}{k!} F^{(k)}(x) \right| < N \qquad (x>0;\ k=0,1,2,\cdots).$

The hypothesis (12.8) implies that

(12.11)    $\int_0^t \varphi(u)\,du \sim at \qquad (t\to 0)$

for a suitable constant $a$. From this it follows that

$\int_0^t u^k\varphi(u)\,du \sim a\,t^{k+1}/(k+1) \qquad (t\to 0).$

For, by virtue of (12.11), we have

$\int_0^t [\varphi(u) - a]\,du = o(t) \qquad (t\to 0).$

If we set

$\beta(t) = \int_0^t u^k[\varphi(u) - a]\,du,$

we must show that

(12.12)    $\beta(t) = o(t^{k+1}) \qquad (t\to 0).$

By an integration by parts (12.12) is easily established.
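This result enables us to show that

(12.13)    $(-1)^k F^{(k)}(x) \sim \frac{a\,k!}{x^{k+1}} \qquad (x\to\infty),$

or that

$\int_0^\infty e^{-xt}t^k[\varphi(t) - a]\,dt = o(x^{-k-1}) \qquad (x\to\infty).$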


Integration by parts gives 


= f e~*'B(t)dt 
0 


Using the relation (12.12) on the first integral on the right-hand side of this 
equation we obtain 


(k +1)! 


| I, | = 


e+ f + a)t*+1dt (x > > 0). 


Hence it is evident that (12.13) is true. From this fact it follows at once that

$\lim_{x\to\infty} f(x) = a, \qquad \lim_{x\to\infty} x^k f^{(k)}(x) = 0 \qquad (k=1,2,\cdots).$

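By Corollary 2 of Theorem 14 we have

$\frac{(-1)^k}{k!}\,x^{k+1}F^{(k)}(x) = f(\infty) + (-1)^{k+1}\int_x^\infty \frac{u^k}{k!} f^{(k+1)}(u)\,du.$

Hence, by (12.10),

$\left| \int_x^\infty \frac{u^k}{k!} f^{(k+1)}(u)\,du \right| < N + |f(\infty)| = M.$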


This completes the proof of the necessity of the condition. 
Conversely, suppose that (12.9) holds. Applying Theorem 14 we have

$\left| \frac{(-1)^k}{k!}\,x^{k+1}F^{(k)}(x) \right| < |f(\infty)| + M \qquad (x>0;\ k=0,1,2,\cdots).$

Then applying a theorem of the author* we see that

$F(x) = \frac{f(x)}{x} = \int_0^\infty e^{-xt}\varphi(t)\,dt,$

where $\varphi(t)$ is uniformly bounded in $(0, \infty)$. It remains only to show that
(12.8) holds. We know that $f(\infty)$ exists since the integral (12.9) converges
for $k=0$ by hypothesis. That is,

$F(x) \sim f(\infty)/x \qquad (x\to\infty).$

Since $\varphi(t)$ is bounded we may apply a familiar Tauberian theorem† and
conclude that

$\int_0^t \varphi(u)\,du \sim f(\infty)\,t \qquad (t\to 0).$

This completes the proof of the theorem.


* D. V. Widder, Necessary and sufficient conditions for the representation of a function as a Laplace
integral, these Transactions, vol. 33 (1931), p. 873, Theorem 13.

† G. H. Hardy and J. E. Littlewood, On Tauberian theorems, Proceedings of the London Mathematical
Society, vol. 30 (1930), p. 23.


Part III
APPLICATIONS


13. Zeros of Laplace integrals. In this section we shall discuss the relation
between the zeros of the integral

(13.1)    $f(x) = \int_0^\infty e^{-xt}\,d\alpha(t)$

and the changes of trend of the function $\alpha(t)$. In case $\alpha(t)$ has a continuous
derivative $\alpha'(t)$, E. Laguerre* proved that the number of zeros of $f(x)$
in the interval of convergence of (13.1) can not exceed the number of changes
of sign of $\alpha'(t)$. He obtained a similar result for the case in which (13.1)
reduces to a Dirichlet series. We here extend the result to the general
integral (13.1).

We first make a precise definition of the notion of change of trend. In this
definition we use the term interval to include an interval of zero length. A
function $\alpha(t)$ is increasing (or decreasing) in the zero interval $(a, a)$ if
$\alpha(a+) > \alpha(a-)$ (or $\alpha(a+) < \alpha(a-)$).

DEFINITION. Let $\alpha(t)$ be a normalized function of bounded variation in the
interval $0 \le t \le R$. Then $\alpha(t)$ has $n$ changes of trend in that interval if there exist
points

$0 = t_0 \le t_1 \le t_2 \le \cdots \le t_{n+1} = R,$

with at most two consecutive $t_i$ equal, such that

(A)  $\alpha(t_i-) \ne \alpha(t_i+)$ if $t_i = t_{i+1}$, $t_i \ne 0$, $t_i \ne R$,
     $\alpha(0) \ne \alpha(0+)$ if $t_0 = t_1 = 0$,
     $\alpha(R) \ne \alpha(R-)$ if $t_n = t_{n+1} = R$;
(B)  $\alpha(t)$ is alternately increasing and decreasing in the intervals
     $(t_0, t_1), (t_1, t_2), \cdots, (t_n, t_{n+1})$.


If a function has $n$ changes of trend in $(0, R)$ for every positive $R$ sufficiently
large, we say that it has $n$ changes of trend in the infinite interval
$(0, \infty)$. In particular if $\alpha(t)$ has a continuous derivative $\alpha'(t)$ then $n$ is the
number of changes in sign of $\alpha'(t)$. On the other hand if $\alpha(t)$ is a step-function,
so that (13.1) reduces to a Dirichlet series, $n$ is the number of changes of
sign in the sequence of the coefficients. We now establish


THEOREM 22. If the function $\alpha(t)$ is of bounded variation in the interval
$(0, R)$ for every positive $R$, and if it has $n$ changes of trend in the interval
$(0, \infty)$, then the function

$f(x) = \int_0^\infty e^{-xt}\,d\alpha(t)$

has at most $n$ zeros in the interval of convergence of the integral.

* E. Laguerre, Oeuvres, vol. 1, p. 29.

By the definition of change of trend it is clear that the function

$\beta(t) = \int_0^t (u - t_1)(u - t_2)\cdots(u - t_n)\,d\alpha(u)$

is monotonic in the interval $(0, \infty)$. Now form the function

(13.2)    $f^*(x) = \int_0^\infty e^{-xt}(t - t_1)(t - t_2)\cdots(t - t_n)\,d\alpha(t) = \int_0^\infty e^{-xt}\,d\beta(t).$

Since $\beta(t)$ is monotonic it follows that $f^*(x)$ has no zeros in the interval of
convergence of the integral (13.2). Set

$(t - t_1)(t - t_2)\cdots(t - t_n) = a_n(-t)^n + a_{n-1}(-t)^{n-1} + \cdots + a_0.$

Then

$f^*(x) = a_n f^{(n)}(x) + a_{n-1} f^{(n-1)}(x) + \cdots + a_0 f(x).$

If $f(x)$ had more than $n$ zeros, this linear differential expression would have
at least one zero, as one sees by a generalized form of Rolle's theorem.†
Since $f^*(x)$ has no zeros, our result is proved.

By use of our inversion formula we can now get a more exact relation
between the number of zeros of $f^{(k)}(x)$ and the number of changes of trend
in $\alpha(t)$. We prove

THEOREM 23. If the function $\alpha(t)$ is a normalized function of bounded
variation with $n$ changes of trend in the interval $(0, R)$ for every sufficiently
large positive $R$, and if $\alpha(0+) = \alpha(0) = 0$, then

$f^{(k)}(x) = (-1)^k \int_0^\infty e^{-xt}t^k\,d\alpha(t)$

has exactly $n$ changes of sign in the interval of convergence of the integral for
all $k$ sufficiently large.


Before proving the theorem we point out that the restriction $\alpha(0+) = \alpha(0)$
is a necessary one. If it were omitted, $\alpha(t)$ could be defined as

$\alpha(0) = 0, \qquad \alpha(t) = 1 \quad (0<t<1),$
$\alpha(1) = \tfrac{1}{2}, \qquad \alpha(t) = 0 \quad (1<t<\infty),$

a function with one change of trend. Yet the derivatives of

$\int_0^\infty e^{-xt}\,d\alpha(t)$

have no change of sign, no matter how high the order.
† G. Pólya, On the mean-value theorem corresponding to a given linear homogeneous differential
equation, these Transactions, vol. 24 (1922), pp. 312-324.
D. V. Widder, A general mean-value theorem, these Transactions, vol. 26 (1924), pp. 385-394.
Since the coefficients $a_i$ are constants, the property W is satisfied in any interval.

To prove the theorem consider the points $t_0, t_1, \cdots, t_{n+1}$ whose existence is
guaranteed by the foregoing definition. Consider two adjoining intervals
$(t_{i-1}, t_i)$ and $(t_i, t_{i+1})$. Suppose that $\alpha(t)$ is increasing in the first and decreasing
in the second. We show first that points $\xi, \eta, \zeta$ exist such that

(13.3)    $\alpha(\xi) < \alpha(\eta) > \alpha(\zeta), \qquad t_{i-1} \le \xi < \eta < \zeta \le t_{i+1}.$

We consider several cases.

Case I. $t_{i-1} \ne t_i$, $t_i \ne t_{i+1}$. In this case $\alpha(t)$ has at least one point of increase
in $t_{i-1} \le t \le t_i$. Hence we can find $\xi$ and $\eta'$ such that

$\alpha(\xi) < \alpha(\eta') \qquad (t_{i-1} \le \xi < \eta' \le t_i).$

Since $\alpha(t)$ has at least one point of decrease in $t_i \le t \le t_{i+1}$ we can determine $\eta''$
and $\zeta$ such that

$\alpha(\eta'') > \alpha(\zeta) \qquad (t_i \le \eta'' < \zeta \le t_{i+1}).$

Choose $\eta$ equal to $\eta'$ or $\eta''$ so that

$\alpha(\eta) \ge \alpha(\eta'), \qquad \alpha(\eta) \ge \alpha(\eta'').$

Then clearly (13.3) is satisfied.

Case II. $t_{i-1} \ne t_i$, $t_i = t_{i+1}$. Choose $\xi$ and $\eta'$ as in Case I. By B of the
definition we see that $\alpha(t_{i+1}+) - \alpha(t_i-) < 0$. Hence we can determine $\eta''$ such
that

$\alpha(\eta'') > \frac{\alpha(t_i+) + \alpha(t_i-)}{2} = \alpha(t_i).$

Choose $\eta$ as in Case I and $\zeta = t_{i+1}$. Then (13.3) is satisfied.

Case III. $t_{i-1} = t_i$, $t_i \ne t_{i+1}$. The treatment of this case is similar to that of
Case II and is omitted.

It is to be noted that 4:~t since a(0+) =a(0). Hence ¢>0 if 7=1. 

Now choose a positive number $\epsilon$ so small that

$\alpha(\xi) + \epsilon < \alpha(\eta) - \epsilon > \alpha(\zeta) + \epsilon.$

By Theorem 2 we can determine an integer $k_0$ so large that

$| S_{k,t}[f(x)] - \alpha(t) | < \epsilon \qquad (t = \xi, \eta, \zeta;\ k > k_0).$

Hence

$S_{k,\xi}[f(x)] < S_{k,\eta}[f(x)] > S_{k,\zeta}[f(x)].$

Since $S_{k,t}[f(x)]$ is a function of class $C'$ at least, it follows that it has at least
one maximum in the interval $t_{i-1} < t < t_{i+1}$, where its derivative vanishes. A
similar proof applies if $\alpha(t)$ is decreasing in $(t_{i-1}, t_i)$ and increasing in $(t_i, t_{i+1})$.



Since there are but a finite number of intervals $(t_i, t_{i+1})$ we can determine
$k_0$ so large that for $k>k_0$ the function $S_{k,t}[f(x)]$ will have at least one

maximum (minimum) in $(t_0, t_2)$,
minimum (maximum) in $(t_1, t_3)$,
maximum (minimum) in $(t_2, t_4)$,
$\cdots$ .

It thus becomes clear that the derivative of $S_{k,t}[f(x)]$ with respect to $t$ will
change sign at least $n$ times in $(0, \infty)$. That is, the function


k* 
(k — 1)! peat ty 


and hence also $f^{(k+1)}(x)$, must change sign at least $n$ times in $(0, \infty)$. By
Theorem 22 we see that $f^{(k+1)}(x)$ must change sign exactly $n$ times in that
interval and the theorem is completely established.


COROLLARY 1. If $\alpha(t)$ has a maximum (minimum)* at a point $t_0$, then for $k$
sufficiently large $f^{(k)}(x)$ will have a change of sign at a point $x_k$ such that

$\lim_{k\to\infty} \frac{k}{x_k} = t_0.$

Let $\epsilon$ be an arbitrary positive number. We must show that there corresponds
a number $k_0$ such that $f^{(k)}(x)$ will have a zero $x_k$ for which

$\left| \frac{k}{x_k} - t_0 \right| < \epsilon \qquad (k > k_0),$

or that $f^{(k)}(k/t)$ will have a zero in the interval $t_0-\epsilon<t<t_0+\epsilon$. This follows
from the inequalities

$S_{k,t_0-\epsilon}[f(x)] < S_{k,t_0}[f(x)] > S_{k,t_0+\epsilon}[f(x)]$

precisely as in the proof of the theorem.

* We do not require that $\alpha(t)$ should be continuous at $t_0$. We must, however, have a maximum
(minimum) in the strict sense.

COROLLARY 2. If

$f(x) = a_1e^{-\lambda_1 x} + a_2e^{-\lambda_2 x} + a_3e^{-\lambda_3 x} + \cdots,$

$0 < \lambda_1 < \lambda_2 < \cdots, \qquad \lim_{m\to\infty} \lambda_m = \infty,$

and if the sequence $a_1, a_2, a_3, \cdots$ has $n$ changes of sign, $f^{(k)}(x)$ will have exactly $n$




changes of sign in the interval of convergence of the series for all k sufficiently 
large. 
COROLLARY 3. If $\alpha(t)$ has $n_R$ changes of trend in $(0, R)$ and if

$\lim_{R\to\infty} n_R = \infty,$

then the number of changes of sign of $f^{(k)}(x)$ in the interval of convergence of the
integral becomes infinite with $k$.


Here we introduce the further 


DEFINITION. Let $\varphi(t)$ be integrable in $(0, R)$. It has $n$ changes of sign in that
interval if the function

$\alpha(t) = \int_0^t \varphi(y)\,dy$

has $n$ changes of trend there.

For example, if $(-1)^i\varphi(t) \ge 0$ almost everywhere in $(t_i, t_{i+1})$, the sign $>$
holding for a set of positive measure $(i=0, 1, 2, \cdots, n)$, then the conditions
of the definition are satisfied.


COROLLARY 4. If $\varphi(t)$ is integrable in $(0, R)$ and has $n$ changes of sign there
for every positive $R$, then the function

$f^{(k)}(x) = (-1)^k\int_0^\infty e^{-xt}t^k\varphi(t)\,dt$

has $n$ changes of sign in the interval of convergence of the integral for all $k$
sufficiently large.

This result includes as a special case a result which the author stated 
earlier without proof.* The increased generality of the present result is note- 
worthy. 

* See the author's Proceedings article cited in the Introduction, Theorem 5.

In case the function $\varphi(t)$ of Corollary 4 has its changes of sign at points in
the neighborhood of which it is different from zero, our inversion formula
enables us to locate the positions of the changes of sign of $\varphi(t)$ if we know the
positions of the changes of sign of $f^{(k)}(x)$. We first make exact the notion of
change of sign at a point.


DEFINITION. Let $\varphi(t)$ be defined in the interval $(0, R)$. Then it has a
change of sign at a point $t=t_i$ $(0<t_i<R)$ if for all positive $\epsilon$ sufficiently small

$\varphi(t_i - \epsilon) > 0, \qquad \varphi(t_i + \epsilon) < 0,$

or

$\varphi(t_i - \epsilon) < 0, \qquad \varphi(t_i + \epsilon) > 0.$



We now establish

THEOREM 24. If $\varphi(t)$ is integrable in $(0, R)$ for every positive $R$, the integral

(13.4)    $f(x) = \int_0^\infty e^{-xt}\varphi(t)\,dt$

converging for some value of $x$, and if $\varphi(t)$ has a change of sign at a point $t_0$,
then for $k$ sufficiently large $f^{(k)}(x)$ will have a change of sign at a point $x_k$ such
that

$\lim_{k\to\infty} \frac{k}{x_k} = t_0.$

Given an arbitrary positive $\epsilon$ we wish to show that we can find an integer
$k_0$ such that for $k>k_0$ the function $f^{(k)}(x)$ will have a change of sign $x_k$ such
that

$\left| \frac{k}{x_k} - t_0 \right| < \epsilon,$

or that the function $f^{(k)}(k/t)$ will have a change of sign between $t_0-\epsilon$ and
$t_0+\epsilon$ for all $k>k_0$. By Theorem 4 the function $L_{k,t}[f(x)]$ approaches $\varphi(t)$
almost everywhere in $(0, \infty)$. Choose a point $\eta$ in the interval $t_0<t<t_0+\epsilon$
and a point $\xi$ in the interval $t_0-\epsilon<t<t_0$ such that

$\lim_{k\to\infty} L_{k,\xi}[f(x)] = \varphi(\xi),$

$\lim_{k\to\infty} L_{k,\eta}[f(x)] = \varphi(\eta).$

But either

$\varphi(\xi) > 0, \quad \varphi(\eta) < 0 \qquad \text{or} \qquad \varphi(\xi) < 0, \quad \varphi(\eta) > 0.$

Hence we can determine $k_0$ so large that for $k>k_0$

$L_{k,\xi}[f(x)] > \frac{\varphi(\xi)}{2} > 0, \qquad L_{k,\eta}[f(x)] < \frac{\varphi(\eta)}{2} < 0,$

or else

$L_{k,\xi}[f(x)] < \frac{\varphi(\xi)}{2} < 0, \qquad L_{k,\eta}[f(x)] > \frac{\varphi(\eta)}{2} > 0.$

In either case the continuous function $L_{k,t}[f(x)]$ must vanish between $\xi$ and
$\eta$. But this function vanishes only when $f^{(k)}(k/t)$ vanishes, so that the theorem
is established. We point out that the theorem could also be derived by use
of Corollary 1 to Theorem 23.




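COROLLARY. If $\varphi(t)$ is continuous in $(0, \infty)$ and has changes of sign at the
points $t_1, t_2, \cdots, t_n$

$(0 < t_1 < t_2 < \cdots < t_n)$

and at no others, then $f^{(k)}(x)$ has exactly $n$ changes of sign for $k$ sufficiently large
at points

$x_{1,k} > x_{2,k} > \cdots > x_{n,k}$

and

$\lim_{k\to\infty} \frac{k}{x_{i,k}} = t_i \qquad (i = 1, 2, \cdots, n).$

Included in this Corollary is a result stated earlier by the author.*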
We now illustrate the theory by an example. Take

$\varphi(t) = (t - a)(t - b) \qquad (0<a<b).$

Then

$f(x) = \frac{2}{x^3} - \frac{a+b}{x^2} + \frac{ab}{x}.$

Simple computations show that the zeros of $f^{(k)}(x)$ are

$x_{1,k} = \{(a+b)(k+1) + [(a+b)^2(k+1)^2 - 4ab(k+1)(k+2)]^{1/2}\}/(2ab),$
$x_{2,k} = \{(a+b)(k+1) - [(a+b)^2(k+1)^2 - 4ab(k+1)(k+2)]^{1/2}\}/(2ab).$

These roots will be real if $k$ is sufficiently large. It must be so large that

$\frac{k+1}{k+2} > \frac{4ab}{(a+b)^2}.$

The right-hand side of this inequality is less than unity if $a \ne b$, whereas the
left-hand side approaches unity as $k$ becomes infinite. It is clear that

$\lim_{k\to\infty} \frac{x_{1,k}}{k} = \frac{1}{a}, \qquad \lim_{k\to\infty} \frac{x_{2,k}}{k} = \frac{1}{b}.$

This example shows that $f^{(k)}(x)$ may have a smaller number of changes of
sign than $\varphi(t)$ for small values of $k$. For example, if $a=9$, $b=10$, the function
$f^{(k)}(x)$ has no zeros for $k<359$, has two zeros for $k>359$.
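The threshold in this example can be checked numerically. The following is a minimal sketch, assuming that the zeros of $f^{(k)}(x)$ for $\varphi(t) = (t-a)(t-b)$ are the roots of the quadratic $ab\,x^2 - (a+b)(k+1)x + (k+1)(k+2) = 0$ (obtained by differentiating $2/x^3 - (a+b)/x^2 + ab/x$ term by term). The function name `sign_change_points` is hypothetical and used only for illustration.

```python
import math


def sign_change_points(a, b, k):
    """Real, distinct zeros of f^(k)(x) for phi(t) = (t - a)(t - b).

    Assumes the zeros are the roots of
        a*b*x^2 - (a + b)(k + 1)*x + (k + 1)(k + 2) = 0.
    Returns [] when the discriminant is not positive (no real roots,
    or a double root at which the sign does not change).
    """
    disc = (a + b) ** 2 * (k + 1) ** 2 - 4 * a * b * (k + 1) * (k + 2)
    if disc <= 0:
        return []
    root = math.sqrt(disc)
    x1 = ((a + b) * (k + 1) + root) / (2 * a * b)
    x2 = ((a + b) * (k + 1) - root) / (2 * a * b)
    return [x1, x2]


if __name__ == "__main__":
    for k in (358, 359, 360, 10**6):
        zeros = sign_change_points(9, 10, k)
        print(k, [k / x for x in zeros])
```

With integer arguments the discriminant is computed exactly, so the boundary case $k = 359$ (a double root) is classified without rounding error. For large $k$ the printed ratios $k/x_{1,k}$ and $k/x_{2,k}$ should approach $a$ and $b$.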
As a further application of Theorem 23 we prove

* Theorem 6 of the Proceedings article cited above.



THEOREM 25. Let the series

$f(z) = \sum_{m=1}^\infty a_m z^m$

have radius of convergence $R$, and let the sequence of real coefficients $a_1, a_2, \cdots$
have $n$ changes of sign. Then the function

$\sum_{m=1}^\infty a_m m^k z^m$

will have exactly $n$ changes of sign in the interval $0<z<R$ for all $k$ sufficiently
large.

Set $z = e^{-x}$ in the given power series. We thus obtain a Dirichlet series
convergent for $x>\log(1/R)$, to which we apply Corollary 2 of Theorem 23. Its
coefficients have exactly $n$ changes of sign. Hence its $k$th derivative will have
exactly $n$ changes of sign in the interval $\log(1/R) < x < \infty$ for all $k$ sufficiently
large. Replacing $x$ by $\log(1/z)$ we have the result stated.*
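A small numerical illustration of Theorem 25 may be helpful. The sketch below assumes the polynomial $z - z^2 + z^3$ (coefficients with two changes of sign, chosen here only as an example) and counts sign changes of $\sum a_m m^k z^m$ on a logarithmic grid. The grid count is a numerical estimate, not a proof, and the helper names are hypothetical.

```python
def weighted_poly(coeffs, k, z):
    """Evaluate sum_{m>=1} a_m * m**k * z**m for coeffs = [a_1, a_2, ...]."""
    return sum(a * m ** k * z ** m for m, a in enumerate(coeffs, start=1))


def count_sign_changes(coeffs, k, z_min=1e-6, z_max=10.0, samples=20_000):
    """Count sign changes of the weighted series on a logarithmic grid.

    Only sign changes resolved by the grid are detected, so the result
    is an estimate that depends on z_min, z_max and samples.
    """
    changes, previous = 0, 0
    ratio = (z_max / z_min) ** (1.0 / (samples - 1))
    z = z_min
    for _ in range(samples):
        value = weighted_poly(coeffs, k, z)
        sign = (value > 0) - (value < 0)
        if sign != 0:
            if previous != 0 and sign != previous:
                changes += 1
            previous = sign
        z *= ratio
    return changes


if __name__ == "__main__":
    for k in (0, 4, 5, 10):
        print(k, count_sign_changes([1, -1, 1], k))
```

For this example the weighted polynomial is $z(1 - 2^k z + 3^k z^2)$, whose discriminant $4^k - 4\cdot 3^k$ first becomes positive at $k = 5$; below that value there are no positive zeros even though the coefficients change sign twice.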

We may also obtain a similar result concerning factorial series.


THEOREM 26. Let the sequence $a_1, a_2, \cdots$ have $n$ changes of sign, and let the
series

(13.5)    $f_l(x) = \sum_{m=1}^\infty \frac{a_m m^l\, m!}{x(x+1)\cdots(x+m)}$

converge for $x>0$. Then it is possible to determine a number $l_0$ such that for any
fixed $l>l_0$ the function $f_l^{(k)}(x)$ will have $n$ changes of sign in the interval
$0<x<\infty$ for all $k$ sufficiently large.


Since the series (13.5) converges we haveT 
(13.6) fi(x) = f 
0 


where 


(13.7) oft) = — 


m=1 


The series (13.6) converges for 0<i<~, or the series 


¥i(z) = 


m=1 


* Cf. Pólya and Szegő, Aufgaben und Lehrsätze aus der Analysis, vol. 2 (1925), p. 44, No. 44.
† D. V. Widder, I, p. 739.


converges for 0<z<1. By Theorem 25 the function y,(z)z will have exactly 
n changes of sign in 0<z<1 if / is greater than some number /y. The same 
will be true of ¥,(z) and hence of ¢,(#) in 0<t<o. We have now only to 
apply Corollary 3 of Theorem 23 to (13.7) to obtain the result stated.* 

14. Special inversion formulas. We conclude Part III with several specific
inversion formulas which, although they are immediate consequences of
the general theory, are nevertheless worthy of separate statement. The first
has to do with functions of the form

$f(x) = \int_0^\infty \frac{d\alpha(t)}{x+t}$

considered by Stieltjes. He gave a complex inversion formula when $\alpha(t)$ is
an increasing function.† Our present methods enable us to give the following
simple real solution of the Stieltjes problem.

THEOREM 27. If the function $\alpha(t)$ is of bounded variation in the infinite
interval $(0, \infty)$, and if

(14.1)    $f(x) = \int_0^\infty \frac{d\alpha(t)}{x+t},$

then

(14.2)    $\frac{\alpha(t+) + \alpha(t-)}{2} = S_t\{L_y[f(x)]\} \qquad (t>0).$


* Compare Pólya and Szegő, loc. cit., vol. 2, p. 51, No. 84.
† See, for example, O. Perron, Die Lehre von den Kettenbrüchen, 1929, p.

To prove this result set

$\varphi(y) = \int_0^\infty e^{-yt}\,d\alpha(t).$

Since $\alpha(t)$ is of bounded total variation in $(0, \infty)$, this integral converges
absolutely for $0 \le y < \infty$. It follows that the integral (14.1) converges for
$x>0$, that $f(x)$ is analytic in the whole complex $x$-plane with the negative
real axis removed, and that

$f(x) = \int_0^\infty e^{-xy}\varphi(y)\,dy.$

This integral converges for $x>0$. But

$\frac{\alpha(t+) + \alpha(t-)}{2} = S_t[\varphi(y)] \qquad (t>0),$

and

$\varphi(y) = L_y[f(x)] \qquad (y>0)$

by Theorems 2 and 3 respectively. Combining these two results we have
(14.2).
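The identity $f(x) = \int_0^\infty e^{-xy}\varphi(y)\,dy$ used in this proof can be checked numerically for a step function. The sketch below assumes that $\alpha(t)$ has finitely many jumps $c_j$ at points $t_j > 0$ (an illustrative choice, not taken from the text), so that $f(x) = \sum c_j/(x+t_j)$ and $\varphi(y) = \sum c_j e^{-yt_j}$. The quadrature is a plain composite Simpson rule on a truncated interval, and all function names are hypothetical.

```python
import math


def stieltjes_transform(jumps, x):
    """f(x) = sum c_j / (x + t_j) for a step function with jumps c_j at t_j."""
    return sum(c / (x + t) for t, c in jumps)


def phi(jumps, y):
    """phi(y) = sum c_j * exp(-y * t_j), the Laplace-Stieltjes transform."""
    return sum(c * math.exp(-y * t) for t, c in jumps)


def laplace_of_phi(jumps, x, upper=60.0, steps=60_000):
    """Composite Simpson approximation of int_0^upper exp(-x*y) phi(y) dy.

    steps must be even; the tail beyond `upper` is neglected.
    """
    h = upper / steps
    total = phi(jumps, 0.0) + math.exp(-x * upper) * phi(jumps, upper)
    for i in range(1, steps):
        y = i * h
        weight = 4 if i % 2 else 2
        total += weight * math.exp(-x * y) * phi(jumps, y)
    return total * h / 3.0


if __name__ == "__main__":
    jumps = [(1.0, 1.0), (3.0, -0.5)]  # (t_j, c_j) pairs, chosen for illustration
    print(stieltjes_transform(jumps, 2.0), laplace_of_phi(jumps, 2.0))
```

The two printed values should agree to within the quadrature error, which reflects the elementary identity $1/(x+t) = \int_0^\infty e^{-(x+t)y}\,dy$ applied to each jump.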
Another application of Theorem 2 to Dirichlet series is contained in 


THEOREM 28. If
f(x) = + + 
lim \, = ©, 


2 


the series converging for x>c, then 


a, = lim (— f — (u)du 
where Xo is defined as zero. 
This follows since 
dn = — 


In a similar way we may obtain the coefficients of a series in powers of 1/x 
in terms of the function it represents. 


THEOREM 29. If
a a a 
x 
the series converging for x>c, then 


==0 


This follows since f(x) can be represented as a Laplace integral* 


fiz) = f 


An application of Theorem 12 now gives the result. 
Finally, the coefficients of a factorial series can also be determined in
terms of the function which it represents.

* See for example, D. V. Widder, I, p. 728.

THEOREM 30. If 


ay a2 


ao 


the series converging for x >c, then 


a, = { lim tog . 
t=0 


koe 


The proof is similar to that of Theorem 29 and is omitted. 


Part IV 
COMPLEX VARIABLE 


15. Generating function analytic at infinity. In previous work we have 
regarded the generating function f and the determining function ¢ as real 
functions of the real variable. Let us now suppose that both are functions of 
the complex variable. Set 


and write 


0 


We are still supposing that the path of integration is along the positive real
axis. If the integral converges for some value of $s$ we easily see by breaking
$\varphi(x)$ into its real and imaginary parts that

$L_x[f(s)] = \varphi(x)$

for all real positive values of $x$. It is natural to inquire if this formula still
holds when $x$ is replaced by the complex variable $z$. In other words, will our
inversion formula hold off the real axis? We shall be able to show that it
holds in the half-plane $x>0$ if the function $\varphi(z)$ is analytic there. In the
present section we shall assume in addition that $f(s)$ is analytic at infinity and
vanishes there. We recall that such a function can be expressed in the form
(15.1). If

(15.2)    $f(s) = \sum_{n=0}^\infty \frac{a_n}{s^{n+1}},$

then

(15.3)    $\varphi(z) = \sum_{n=0}^\infty \frac{a_n z^n}{n!},$

and $\varphi(z)$ is entire.*
We now establish

THEOREM 31. If $f(s)$ is analytic at infinity and vanishes there, then $L_z[f(s)]$
exists for all complex $z$ and defines an entire function $\varphi(z)$ such that

$f(s) = \int_0^\infty e^{-st}\varphi(t)\,dt,$

the integral converging in some half-plane $\sigma>\sigma_0$.


Since $f(s)$ has the representation (15.2), the series converging in some
neighborhood of infinity, there exist numbers $M$ and $\rho$ such that

$|a_n| < M\rho^n \qquad (n = 0, 1, 2, \cdots).$

Simple computation gives

(15.4)    $L_{k,z}[f(s)] = \sum_{n=0}^\infty a_n \frac{(n+k)!}{k^n k!} \frac{z^n}{n!} \qquad (|z| < k/\rho).$

It will now be shown that the series (15.4), whose terms are to be regarded
as functions of the two variables $k$ and $z$, converges uniformly in the region
$|z| \le l$, $k \ge k_0$, where $l$ is an arbitrary positive constant and $k_0$ is a suitably
chosen positive integer. We note that

$\frac{(n+k)!}{k^n k!} = \left(1 + \frac{1}{k}\right)\left(1 + \frac{2}{k}\right)\cdots\left(1 + \frac{n}{k}\right)$

is a decreasing function of $k$ for any positive integer $n$. Having chosen $l$
arbitrarily we choose the integer $k_0$ greater than $l\rho$. We thus have

$\sum_{n=0}^\infty |a_n| \frac{(n+k)!}{k^n k!} \frac{|z|^n}{n!} \le M \sum_{n=0}^\infty \frac{(n+k_0)!}{k_0^n k_0!} \frac{(l\rho)^n}{n!}.$

Since the convergent dominant series is independent of $k$ and of $z$, the uniform
convergence of (15.4) is established. Consequently we may take the
limit of the series (15.4) term by term as $k$ becomes infinite for any fixed $z$
whose modulus is $\le l$ provided that the limit of the general term exists. But

$\lim_{k\to\infty} \frac{(n+k)!}{k^n k!} = 1.$

It follows that

$\lim_{k\to\infty} L_{k,z}[f(s)] = \sum_{n=0}^\infty \frac{a_n z^n}{n!} = \varphi(z)$

for $|z| \le l$. Since $l$ was arbitrary, the theorem is completely established.

* See, for example, D. V. Widder, I, p. 728.
COROLLARY. Under the conditions of the theorem,

$\lim_{k\to\infty} L_{k,z}[f(s)] = \varphi(z)$

uniformly in any closed region of the $z$-plane.

This follows at once from the fact that the series (15.4) converged uniformly
in $z$ as well as in $k$.
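The convergence asserted here can be observed on a simple case. The sketch below assumes the example $f(s) = 1/(s-1)$, for which $a_n = 1$ and $\varphi(z) = e^z$, and assumes the operator has the form $L_{k,z}[f(s)] = \frac{(-1)^k}{k!} f^{(k)}(k/z)(k/z)^{k+1}$, which for this $f$ gives the closed form $(1 - z/k)^{-(k+1)}$. The function names are hypothetical; `L_series` sums the series (15.4) directly.

```python
import cmath


def L_series(a, k, z, terms=400):
    """Partial sum of (15.4): sum_n a(n) (n+k)!/(k^n k! n!) z^n, for |z| < k/rho.

    `a` is a callable returning the coefficient a_n.
    """
    total = 0j
    weight = 1 + 0j  # (n+k)!/(k^n k! n!) * z^n at n = 0
    for n in range(terms):
        total += a(n) * weight
        weight *= (n + 1 + k) * z / (k * (n + 1))
    return total


def L_closed_form(k, z):
    """For f(s) = 1/(s - 1), i.e. a_n = 1: (1 - z/k)^(-(k+1))."""
    return (1 - z / k) ** (-(k + 1))


if __name__ == "__main__":
    z = 1 + 2j
    for k in (200, 2000, 20000):
        print(k, abs(L_closed_form(k, z) - cmath.exp(z)))
```

The printed errors should shrink roughly like $1/k$, and the point $z = 1+2i$ lies off the real axis, which is the situation this section is concerned with.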

16. Determining function analytic in a half-plane. We now turn to the 
case in which the determining function is analytic in the half-plane. We shall 
show that in this case our inversion formula is valid throughout the half- 
plane. The result to be proved is 


THEOREM 32. Let the function $\varphi(z)$ be analytic in the half-plane $x>0$ and
let the integral

$f(s) = \int_0^\infty e^{-st}\varphi(t)\,dt$

converge for some value of $s$. Then

$\lim_{k\to\infty} L_{k,z}[f(s)] = \varphi(z)$

uniformly in any closed region in the half-plane $x>0$.


It is to be noted that we are not assuming that $\varphi(z)$ is analytic at the
origin. Thus our proof will apply to such functions as $z^{-1/2}$. This degree of
generality is reflected in a corresponding complication in the proof.

We obtain at once the following integral representation of $L_{k,z}[f(s)]$,

$L_{k,z}[f(s)] = \frac{1}{k!}\left(\frac{k}{z}\right)^{k+1}\int_0^\infty e^{-ku/z}u^k\varphi(u)\,du.$

Here $u$ is for the present a real variable. Later we shall alter the path of
integration and $u$ will be a complex variable. Set $z = \rho e^{i\psi}$ and $u = re^{i\theta}$. Let $D$ be an
arbitrary closed region in the half-plane $x>0$. We can determine positive
constants $\rho_0$, $\rho_1$ and $\psi_0<\pi/2$ such that for all points $z$ of $D$ we have $\rho_0 \le \rho \le \rho_1$,
$|\psi| \le \psi_0$. Now consider the function

$| e^{-u/z}u/z | = (r/\rho) \exp[(-r/\rho)\cos(\theta - \psi)].$

As $z$ varies in $D$ and $u$ varies along the positive real axis we have

$(r/\rho) \exp[(-r/\rho)\cos\psi] \le (r/\rho_0) \exp[(-r/\rho_1)\cos\psi_0].$

The right-hand side of this inequality is independent of the value of $z$ in $D$
and tends to zero as $r$ becomes infinite or as $r$ approaches zero. Consequently
we can determine a positive number $\delta<\rho_0$ and a positive number $A>\rho_1$, both
independent of $z$ in $D$ such that

(16.1)    $| e^{-u/z}u/z | < l < e^{-1}$

for $r \le \delta$ and for $r \ge A$.
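We can now show that

(16.2)    $\lim_{k\to\infty} \frac{1}{k!}\left(\frac{k}{z}\right)^{k+1}\int_A^\infty e^{-ku/z}u^k\varphi(u)\,du = 0,$

(16.3)    $\lim_{k\to\infty} \frac{1}{k!}\left(\frac{k}{z}\right)^{k+1}\int_0^\delta e^{-ku/z}u^k\varphi(u)\,du = 0$

uniformly in $D$. Set

$\alpha(u) = \int_0^u \varphi(t)\,dt$

for positive real values of $u$. Then constants $M$ and $q$ exist such that

(16.4)    $| \alpha(u) | < Me^{qu} \qquad (0 \le u < \infty).$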
Hence 


=) f a(A)— (=) 


1 k k+1 u 
(k—1)!\z2 4 


for k sufficiently large. The first term on the right-hand side of this equation 
satisfies the inequality 


p = | a(A) | 7 Po’; 


A k 
| | | — | 
where / <e—!, as one sees by (16.1). The dominant function is independent of 
z in D and approaches zero with 1/k. 
The second term on the right-hand side of (16.5) is in modulus at most 


equal to 


1 
f alu) | e~ 4 +— 
(k—1)! A | ( )| Po 


where kp is chosen so large that the integral converges. This is possible by 




(16.4). The dominant function is again independent of $z$ in $D$ and approaches
zero with $1/k$, so that (16.2) is completely established. In a precisely similar
way (16.3) is established, an integration by parts being necessary since
$|\varphi(u)|$ need not be bounded in $(0, \delta)$. However, it is unnecessary to use (16.4)
in the present case since $\alpha(u)$ is continuous and hence bounded in $0 \le u \le \delta$.
We omit the details of the proof.

It remains to show that

(16.6)    $\lim_{k\to\infty} \frac{1}{k!}\left(\frac{k}{z}\right)^{k+1}\int_\delta^A e^{-ku/z}u^k\varphi(u)\,du = \varphi(z)$

uniformly in $D$. To establish this point we must alter the path of integration.
We replace the segment of the real axis by a curve $C$ composed of three arcs

$C_1$: $r = \delta$ $(0 \le \theta \le \psi)$,
$C_2$: $\theta = \psi$ $(\delta \le r \le A)$,
$C_3$: $r = A$ $(0 \le \theta \le \psi)$.

We have here supposed that $\psi \ge 0$. If $\psi<0$, replace the inequalities defining
$C_1$ and $C_3$ by $\psi \le \theta \le 0$. Since the integrand of (16.6) is analytic in the region
bounded by the line segment $\delta \le u \le A$ and the curve $C$, we do not alter the
value of the integral by changing the path of integration as indicated. It is
to be noted that the curve $C$ changes as $z$ varies in $D$.

We now show that 


1 
lim ~(—) f e~kuleyke(u)du = 0 (i = 1, 3) 
kaw zg Ci 


uniformly in D. We prove it for i=1, the proof for 7=3 being similar. We 


have 
1 /k\* 
f 


1 


(-) f cosy] k+1 | | do 
Po 0 


< — | do. 
po Yo 
This upper bound is independent of z in D and approaches zero with 1/k. 
We have thus reduced our problem to that of showing that 


lim (=) f ep ket dr = $(z) 


kaw k!\ zg 




uniformly in D. But by Theorem 6 


pett 4 A 
lim —— f e~*rlegkdy = 1 
pttid, 


uniformly* in the interval pp Sr Sp. Hence we must show that 
1 
kl p*tt 


uniformly in D. Make the change of variable v=1/p. The integral becomes 


— $(z)|dr = 0 


Alp 


Given an arbitrary positive ¢ we shall show that we can determine a number 
k, independent of z in D such that 


| 1(k)| <e (k > ky). 


We first observe that 
4/po 
| I(k) | < f | — | dv. 
k! 


Since ¢(z) is uniformly continuous in z we can determine a number ¢ such 
that 


| o(2’) — 4(2”)| < «/3 <9) 
provided only that z’ and 2” are in D. Set »={/p1. Then if | 1—v| <n, we have 

|z — < pin = ¢. 
Hence if z is in D and |1—»| <7 it follows that 
(16.7) | — (vz) | < €/3. 

We now have 
| | S + + 
Rett 


=—— | **v* | (02) — $(z)| do, 
k! 8/p1 


where 


and where J,() and J;(k) are similar integrals with intervals of integration 
(1—n, and (1+7, A/po) respectively. 


* Take the function ¢(#) of Theorem 6 equal to zero for OS#S6 and for A<t< ~, and equal to 
unity for 5<t<A. Since 5<po<pi<A, the interval (po, 1) is an interval of continuity of ¢(é). 




By (16.7) 
ktl ¢ 


€ 
I2(k)| S e~**ykdy < —- 
| 3 


If zis in D, and if 6/p:S0<A/po, then vz surely lies in the closed region 
Sp <piA/po, |¥| 


Denote the maximum of | ¢(z)| in that region by NV. Then 


(16.8) | 


(16.9) | | S 


If we determine ; so large that the right-hand members of (16.8) and (16.9) 
are each less than ¢/3 for k>k, we have 


| 1(k)| <e (k > ki). 


This completes the proof of the theorem. 

17. The zeros of the determining function. The applicability of our
inversion formula in the complex domain enables us to extend the study of the
zeros of the determining function made in Part III to the case when these
zeros are no longer on the real axis. We first take the case in which the
generating function is analytic at infinity and prove

THEOREM 33. Let the function $f(s)$ be analytic at infinity and have the
representation

$f(s) = \int_0^\infty e^{-st}\varphi(t)\,dt.$

If $\varphi(z)$ has $n$ zeros not at $z=0$ in the region $|z|<l$, then there exists an integer $k_1$
such that $f^{(k)}(s)$ has $n$ zeros not at $s=\infty$ in the region $|s|>k_1/l$ for $k \ge k_1$.

For, suppose that in addition to the $n$ zeros of $\varphi(z)$ which are not at the
origin there are $m$ zeros at the origin. Then $\varphi(z)$ has $m+n$ zeros in the region
$|z|<l$. By the Corollary of Theorem 31

$\lim_{k\to\infty} L_{k,z}[f(s)] = \varphi(z)$

uniformly in the region $|z| \le l$. Hence by a theorem of E. Rouché* there will
exist an integer $k_1$ such that for $k \ge k_1$ the function $L_{k,z}[f(s)]$, which is surely
analytic at $z=0$ if suitably defined there, will have exactly $n+m$ zeros in the
region $|z|<l$. If $f(s)$ and $\varphi(z)$ have the power series developments (15.2)
and (15.3), then

$a_0 = a_1 = \cdots = a_{m-1} = 0.$

Consequently the function $f^{(k)}(s)s^{k+1}$ will have $m$ zeros at infinity and
$L_{k,z}[f(s)]$ will have $m$ zeros at $z=0$ and hence $n$ zeros in the region $|z|<l$
not at $z=0$. But $f^{(k)}(k/z)$ has the same zeros as $L_{k,z}[f(s)]$ except at $z=0$. It
follows that $f^{(k)}(s)$ has $n$ zeros not at $s=\infty$ in the region $|s|>k/l \ge k_1/l$. This
completes the proof of the theorem.

* See for example Pólya and Szegő, loc. cit., vol. 1, p. 122, No. 194.

If f(s) is no longer analytic at infinity we are not at liberty to suppose 
that ¢(z) is entire. If ¢(z) is analytic in the half-plane x >0 we may still ob- 
tain results concerning its zeros. We prove 


THeoreM 34. Let the function ¢(pe**) be analytic in the half-plane —x/2 
and let 
sls) 
0 


the integral converging for some value of s. If ¢(pe*¥) has n zeros in the region 
< < 2/2, 
then for all k sufficiently large f(s) will have n zeros in the region 
k/p2 <p < k/pi,|¥| < ve. 


By Theorem 32 we have 
(- 1)* k k k+1 
lim = tim = 
ko @ k! Zz 
uniformly in the region 
pi Sp S Sve. 


By Rouché’s Theorem there exists a number &; such that for k2 4, the func- 
tion Lx,.[f(s) ] and hence also f‘*)(k/z) has exactly m zeros in the region 


pi<p<p2,|¥| < 


We obtain the result of the theorem by replacing z by k/s. 
We turn now to the complex analogue of Theorem 24. 


THEOREM 35. Let the function $\varphi(z)$ be analytic in the half-plane $x>0$, and let

$f(s) = \int_0^\infty e^{-st}\varphi(t)\,dt,$

the integral converging for some value of $s$. If $\varphi(z)$ has a zero at a point $z_0$, $x_0>0$,
then for $k$ sufficiently large $f^{(k)}(s)$ will have a zero at a point $s_k$ such that

$\lim_{k\to\infty} \frac{s_k}{k} = \frac{1}{z_0}.$

Let the zero $z_0$ of $\varphi(z)$ be of order $\lambda$. With $z_0$ as center describe a circle
of radius $\eta$ so small that no other zero of $\varphi(z)$ lies in it and such that the circle
lies entirely in the half-plane $x>0$. Then by the Theorem of Rouché we can
determine an integer $k_1$ so large that for $k>k_1$ the function $f^{(k)}(k/z)$ will
have exactly $\lambda$ zeros (distinct or coincident) inside this circle. Denote any
one of them by $z_k$. Then $f^{(k)}(s)$ will vanish at the point $s_k = k/z_k$. Then

$\left| \frac{k}{s_k} - z_0 \right| < \eta.$

Since $\eta$ was arbitrarily small it follows that

$\lim_{k\to\infty} \frac{k}{s_k} = z_0.$

In a similar way we could show that if $f(s)$ is analytic at infinity and if
$\varphi(z)$ vanishes at a point $z_0$ not the origin, there would exist a zero $s_k$ of $f^{(k)}(s)$
for $k$ sufficiently large such that $k/s_k$ approaches $z_0$ as $k$ becomes infinite.

We illustrate the result by several examples. The first example following
Theorem 24 gives us an interesting example of the present theory in case
$a=b>0$. Then $\varphi(t)$ has a zero at $a$ but no change of sign there, so that the
real theory fails. We find that $f^{(k)}(s)$ has the complex zeros

$s_{k,1} = [(k+1) + i(k+1)^{1/2}]/a,$
$s_{k,2} = [(k+1) - i(k+1)^{1/2}]/a.$

Hence $f^{(k)}(s)$ will have no real zeros however large $k$ may be. But it will have
two complex zeros for all $k$ and

$\lim_{k\to\infty} \frac{s_{k,j}}{k} = \frac{1}{a} \qquad (j = 1, 2).$

The same analysis holds if $a<0$ or if $a$ is complex, and provides an example
to illustrate the case in which the zeros of $\varphi(z)$ may be in the half-plane $x<0$.
Of course $f(s)$ is analytic at infinity.



Part V 
THE MOMENT PROBLEM 


18. The problem. In this part we shall consider the infinite system of
equations

(18.1)    $\mu_n = \int_0^1 t^n\,d\alpha(t) \qquad (n = 0, 1, 2, \cdots),$

where $\alpha(t)$ is of bounded variation in the interval $(0, 1)$ and $\alpha(1) = 0$. The relation
of this system of equations to the integral equation (1.1) becomes evident
if the substitution $t = e^{-u}$ is made:

$\mu_n = \int_0^\infty e^{-nu}\,d\{-\alpha(e^{-u})\} \qquad (n = 0, 1, 2, \cdots).$

Here the variable $n$, running through a discrete set, replaces the continuous
variable $x$ of (1.1). We should expect then to be able to determine the function
$\alpha(t)$ in terms of the sequence $\{\mu_n\}$ by use of an operator similar to (3.1),
(3.2), replacing the derivatives of $f(x)$ in that expression by the differences of
$\{\mu_n\}$ and the integral sign by a summation sign. We shall find that the
expected analogy is complete and that we can also obtain in a similar way
an operator analogous to $L_t[f(x)]$ which, when applied to a sequence

$\mu_n = \int_0^1 t^n\varphi(t)\,dt,$

will yield the function $\varphi(t)$.

19. An extension of the Laplace method. In the following section we 
shall need an extension of the classical Laplace method for the asymptotic 
evaluation of a definite integral. We state the result that we shall need as a 


Lemma. Let the function h(x) be of class C'' in the interval $a \le x \le b$ and
satisfy the conditions

\[ h'(b) = 0, \qquad h''(b) < 0, \qquad h(x) < h(b) \quad (a \le x < b); \]

let the functions $\phi_1(x), \phi_2(x), \dots$ be of class C' in $a \le x \le b$ and satisfy the
conditions
| d(x) | 
| < M =1,2,---), 
| = 1 (k = 1,2,---), 
where W(x) is integrable in (a, b) and M is a constant independent of k. Then 


b 1/2 
~ 




1934] THE INVERSION OF THE LAPLACE INTEGRAL 175 


This reduces to the classical result if the functions $\phi_k(x)$ are all equal to
a single function $\psi(x)$. Since the proof is much the same as in the classical
case we omit it here.*

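20. A preliminary limit. We begin with a consideration of the limit of

\[ H_k(t) = \sum_{i=n+1}^{\infty} \frac{(i+k)!}{i!\,k!}\, t_0^{\,i}\,(1-t_0)^{k+1} \qquad \left( n = \left[\frac{kt}{1-t}\right],\ 0 < t_0 < 1 \right) \]

as k becomes infinite. Here, as in the remainder of the paper, the notation
[u] means the largest integer contained in u. Taylor's series with exact
remainder for the function $(1-x)^{-k-1}$ gives

\[ (1-t_0)^{-k-1} = \sum_{i=0}^{n} \frac{(k+i)!}{i!\,k!}\, t_0^{\,i} + \frac{(n+k+1)!}{n!\,k!} \int_0^{t_0} \frac{(t_0-x)^n}{(1-x)^{n+k+2}}\,dx. \]

Hence

\[ H_k(t) = (1-t_0)^{k+1}\, \frac{(n+k+1)!}{n!\,k!} \int_0^{t_0} \frac{(t_0-x)^n}{(1-x)^{n+k+2}}\,dx, \]

or, if we set $u=(t_0-x)/(1-x)$,

\[ H_k(t) = \frac{(n+k+1)!}{n!\,k!} \int_0^{t_0} u^n (1-u)^k\,du. \tag{20.1} \]

Let us first consider $H_k(t_0)$. Set

\[ v_0 = t_0/(1-t_0), \qquad a = k v_0 - n. \]

Then a depends on k and satisfies the relation

\[ 0 \le a < 1. \]

Since

\[ \int_0^1 u^n(1-u)^k\,du = \frac{n!\,k!}{(n+k+1)!}, \]

we have

\[ \frac{1 - H_k(t_0)}{H_k(t_0)} = \frac{\int_{t_0}^{1} \{u^{v_0}(1-u)\}^k u^{-a}\,du}{\int_0^{t_0} \{u^{v_0}(1-u)\}^k u^{-a}\,du}. \]

By use of the Lemma of the previous section we shall show that this quantity
approaches unity and hence that $H_k(t_0)$ approaches 1/2 as k becomes infinite.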
We first obtain an asymptotic expression for the integral 


*See, for example, Pólya and Szegő, Aufgaben und Lehrsätze aus der Analysis, vol. I,
p. 80, problem 212.




\[ \int_0^{t_0} \{u^{v_0}(1-u)\}^k u^{-a}\,du. \tag{20.2} \]

Let $\delta$ be any positive quantity less than $t_0$. Then


— u)}*u-*du < {50(4 ay f (1 — u)du 


0 


6 


\[ = \{t_0^{\,v_0}(1-t_0)\}^k\, O(\xi^k), \tag{20.3} \]

where $\xi$ is a positive constant less than unity and independent of k. We now
apply the Lemma to the integral

\[ \int_\delta^{t_0} \{u^{v_0}(1-u)\}^k u^{-a}\,du. \]


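For the application we have

\[ h(x) = v_0 \log x + \log(1-x), \]
\[ h'(x) > 0 \quad (x < t_0), \]
\[ h'(t_0) = 0, \qquad h''(t_0) = -\frac{1}{t_0(1-t_0)^2} < 0, \]
\[ \phi_k(x) = x^{-a}, \qquad |\phi_k'(x)| < 1/\delta^2 = M, \]
\[ \phi_k(t_0) = t_0^{-a} \ge 1. \]

Hence we conclude that

\[ \int_\delta^{t_0} \{u^{v_0}(1-u)\}^k u^{-a}\,du \sim t_0^{\,n}(1-t_0)^{k+1} \left(\frac{\pi t_0}{2k}\right)^{1/2} \qquad (k \to \infty). \]

By virtue of (20.3) we see that the integral (20.2) has this same asymptotic
expression. Similar reasoning shows that

\[ \int_{t_0}^{1} \{u^{v_0}(1-u)\}^k u^{-a}\,du \sim t_0^{\,n}(1-t_0)^{k+1} \left(\frac{\pi t_0}{2k}\right)^{1/2} \qquad (k \to \infty), \]

so that

\[ \lim_{k\to\infty} H_k(t_0) = \tfrac{1}{2}. \]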


Next consider the case $t>t_0$. Set

\[ v = t/(1-t). \]

Since the function $u^{v}(1-u)$ is increasing in the interval 0<u<t (with its
maximum at $u=t>t_0$), it follows that




— u)*du< { to°(1 to) } #1, 
° + k + 2) 
< T(k + 1)T(ko + to) 


By use of Stirling’s formula we can show that the right-hand side of this in- 
equality approaches zero with 1/k. Thus 


k k 1 vk+k+1 k vk 1/2 
lim H,(t) = lim (<) (<) { to°(1 
kw k vk 2krv 


The function of k on the right may be regarded as the kth term of an infinite
series whose test ratio is
v+i1\? 
( ) (v+ 1)to(1 — 


We can show that this is less than unity. Introducing t and setting $t-t_0=\eta$
we must show that


(20.4) (1 ) <1. 
i 


Employing a familiar inequality we have 


from which (20.4) follows at once. Hence the series in question converges 
and 


\[ \lim_{k\to\infty} H_k(t) = 0 \qquad (t > t_0). \]


Finally, if $0<t<t_0$, we write

\[ 1 - H_k(t) = \frac{(n+k+1)!}{n!\,k!} \int_{t_0}^{1} u^n(1-u)^k\,du, \]


T(vk + k + 2) 1 


We treat the right-hand side of this inequality as before. We are again led to 



(1 *) <1—y, 

t 

1—t 

<1+n, 

1—t? 

n ( n 

1——})(1+—— <1-—77? <1, 

( 3 




prove (20.4), but now $\eta$ is a negative quantity greater than −1. The
inequalities employed are not affected by this change so that

\[ \lim_{k\to\infty} [1 - H_k(t)] = 0. \]


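Consequently we have proved
THEOREM 36. If

\[ H_k(t) = \sum_{i=n+1}^{\infty} \frac{(i+k)!}{i!\,k!}\, t_0^{\,i}\,(1-t_0)^{k+1} = \frac{(n+k+1)!}{n!\,k!} \int_0^{t_0} u^n(1-u)^k\,du, \qquad n = \left[\frac{kt}{1-t}\right], \]

then

\[ \lim_{k\to\infty} H_k(t) = \begin{cases} 1 & (0 < t < t_0), \\ \tfrac{1}{2} & (t = t_0), \\ 0 & (t_0 < t < 1). \end{cases} \]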


We have really solved here the system of moment equations

\[ \mu_n = \int_0^1 t^n \phi(t)\,dt \]

and found that

\[ \phi(t) = \lim_{k\to\infty} \frac{(n+k+1)!}{n!\,k!}\,(-1)^k \Delta^k \mu_n. \]
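The limit in Theorem 36 can be observed numerically. The following is a minimal sketch, not part of the original argument: the function name `H` and the sample values of k, t and t_0 are chosen here for illustration. It evaluates the series form of H_k(t), using the fact that the full series over i >= 0 sums to unity, and works with logarithms of the factorials to avoid overflow.

```python
import math

def H(k, t, t0):
    """H_k(t) of (20.1), from its series form with n = [k t / (1 - t)]."""
    n = math.floor(k * t / (1.0 - t))
    log_t0 = math.log(t0)
    log_tail = (k + 1) * math.log(1.0 - t0)
    head = 0.0
    for i in range(n + 1):
        # log of (i+k)!/(i! k!) * t0^i * (1 - t0)^(k+1), via lgamma to avoid overflow
        log_term = (math.lgamma(i + k + 1) - math.lgamma(i + 1)
                    - math.lgamma(k + 1) + i * log_t0 + log_tail)
        head += math.exp(log_term)
    # the series over all i >= 0 sums to 1, so the tail i > n equals 1 - head
    return 1.0 - head

if __name__ == "__main__":
    t0 = 0.4  # illustrative choice
    for k in (50, 500, 5000):
        print(k, [round(H(k, t, t0), 4) for t in (0.3, 0.4, 0.5)])
```

For increasing k the three printed values should drift toward 1, 1/2 and 0 respectively, in agreement with the three cases of the theorem.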
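21. Inversion of the general sequence of moments. We now introduce the
following operators.

DEFINITION. An operator $S_t\{\mu_n\}$ is defined by the equations

\[ S_{k,t}\{\mu_n\} = \sum_{i=n+1}^{\infty} \frac{(i+k)!}{i!\,k!}\,(-1)^{k} \Delta^{k+1} \mu_i, \qquad n = \left[\frac{kt}{1-t}\right], \]
\[ S_t\{\mu_n\} = \lim_{k\to\infty} S_{k,t}\{\mu_n\}. \]

DEFINITION. An operator $L_t\{\mu_n\}$ is defined by the equations

\[ L_{k,t}\{\mu_n\} = \frac{(n+k+1)!}{n!\,k!}\,(-1)^k \Delta^k \mu_n, \qquad n = \left[\frac{kt}{1-t}\right], \]
\[ L_t\{\mu_n\} = \lim_{k\to\infty} L_{k,t}\{\mu_n\}. \]

We shall now show that $S_t\{\mu_n\}$ inverts the sequence (18.1).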




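THEOREM 37. If the function $\alpha(t)$ is of bounded variation in the interval
(0,1) with $\alpha(1)=0$, and if

\[ \mu_n = \int_0^1 t^n\,d\alpha(t) \qquad (n = 0, 1, 2, \dots), \]

then

\[ S_t\{\mu_n\} = \frac{\alpha(t+) + \alpha(t-)}{2} \qquad (0<t<1). \]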


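An integration by parts gives us

\[ \nu_n = -\frac{\mu_{n+1}}{n+1} = \int_0^1 t^n \alpha(t)\,dt \qquad (n = 0, 1, 2, \dots). \]

We first show that $L_t\{\nu_n\}$ is well defined. We have

\[ L_{k,t}\{\nu_n\} = \frac{(n+k+1)!}{n!\,k!} \int_0^1 u^n(1-u)^k \alpha(u)\,du, \qquad n = \left[\frac{kt}{1-t}\right]. \]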


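Let $t_0$ be an arbitrary point in the interval 0<t<1. Set

\[ g(t) = \begin{cases} -1 & (0 < t < t_0), \\ 0 & (t_0 < t < 1). \end{cases} \]

Form the function

\[ \psi(t) = g(t)\{\alpha(t_0+) - \alpha(t_0-)\} + \alpha(t_0+), \]

so that the function

\[ \phi(u) = \alpha(u) - \psi(u) \]

has the property that

\[ \phi(t_0+) = \phi(t_0-) = 0. \]

Now set up the integral

\[ I_k = \frac{(n+k+1)!}{n!\,k!} \int_0^1 y^n(1-y)^k \phi(y)\,dy, \qquad n = \left[\frac{kt_0}{1-t_0}\right]. \]

We divide the interval of integration into three parts $(0, t_0-\eta)$, $(t_0-\eta, t_0+\eta)$,
$(t_0+\eta, 1)$, denoting the corresponding integrals by $I_1$, $I_2$, $I_3$ respectively.
Given an arbitrary positive $\epsilon$ we determine $\eta$ so small that

\[ |\phi(y)| = |\alpha(y) - \psi(y)| < \epsilon/3 \qquad (0 < |y - t_0| < \eta). \]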






Then we have 
(n+k+1)! 
nik! 


n k 1)! 
< — (1 — + M, 
nk! 


! 
Nik! 


2, = 3. 


where M is a suitably chosen constant. We have already seen in §20 that the
right-hand members of the last two inequalities approach zero with 1/k, so
that it is clearly possible to determine $k_0$ so large that

\[ |I_k| < \epsilon \]

for $k>k_0$. But

\[ I_k = L_{k,t_0}\{\nu_n\} + \{\alpha(t_0+) - \alpha(t_0-)\}\, \frac{(n+k+1)!}{n!\,k!} \int_0^{t_0} y^n(1-y)^k\,dy - \alpha(t_0+), \tag{21.1} \]

so that by allowing k to become infinite we obtain

\[ 0 = L_{t_0}\{\nu_n\} + \frac{\alpha(t_0+) - \alpha(t_0-)}{2} - \alpha(t_0+), \]
\[ L_{t_0}\{\nu_n\} = \frac{\alpha(t_0+) + \alpha(t_0-)}{2}. \]

To evaluate the limit of the second term on the right-hand side of (21.1) we
have employed Theorem 36.
It remains to show that 
= Set{un} 
To do this we prove first that 


7 k 1)! 7 k 1)! ¢? 
ino (R + 1) Ii! ive Jo 


and that yz, exists. Introduce the function 


w(t) = a(t) — a(i —) (0s#8 1). 
Then 


1 1 
Tin = f t+1(1 — t)*da(t) = f t+1(1 — #)*dw(t) = 0,1, 2,---). 
0 0 




In particular 


= J titIda(t) = j 


1 
=-—a(l—-)— (i+ nf Hw(t)dt (¢ = 0,1, 2,---). 
0 


Given an arbitrary positive e, we determine a number 6 such that 


| w(t) | < «/2 (1-65 <#<1). 
Then 


1 1-8 ¢ 
(i + 1) (i + 1) + G+ 1) 


1-8 
S (1 — 6)? + /2, 


where WN is an upper bound for |w(é)| in (0,1). We can now determine i so 
large that is less than €/2, so that 


Ha = —a(i—). 
Next consider 
1 1 1 
lin = ti+1(1 — t)*dw(t) = = — (i +1 t1B(t)dt, 
(1 — *de(t) Bt) = — (i+ 


where 


A(t) = — x)*dw(x) (0s#<1), 
B(1) = 0. 


Since B(#) is continuous at ¢=1, it follows that 
B(t) = o((1 — #)*) (¢—> 1). 
Hence it is easily seen that 
= of ( + vf — = (i> 0). 
Consequently we have proved that 
(GG+k+1)! 
(+ 
With this fact at our disposal we shall be able to show that the series 


(¢+ 2+ 1)! 
(21.3) te + a + 1)! ( 1) 


(21.2) — =0 (k = 0,1,2,---). 




converges and has the sum 


(21.4) 


i=0 iln! 


We proceed by induction. For k=0 the relation reduces to 


= Un+1- 
The series converges since $\mu_i$ is known to approach a limit as i becomes
infinite, and partial summation shows the equation to be true. Now assume
that (21.3) is equal to (21.4) and prove that the same equation holds when
k is replaced by k+1. Then
k+l (m + i)! (¢+k-+ 1)! 

nii! ( fet Ri + 1)! ( ) 

(n+k+1)! 

(k + 1)In! 


(21.5) 
(— 1)*HAF 


We observe that 


G@+k+2)! 
kG+1)! 


and apply partial summation to the right-hand side of (21.5). By virtue of 
(21.2) we thus obtain 
+ i)! (t+k+2)! 


= Keo + (— (k + 1) + 1)! 


k+2 


Bi+1- 


The induction is complete. 
But (21.3) is Also (21.4) is For, 


i 


(n+k-+ 1)! 


nik! — 1)*A*y,, 


k(n + 1)! 


i!n! 


Hence

\[ S_t\{\mu_n\} = \lim_{k\to\infty} S_{k,t}\{\mu_n\} = \frac{\alpha(t+) + \alpha(t-)}{2}, \]

and the proof of the theorem is complete.
In the course of the proof we have demonstrated 


THEOREM 38. If the function $\phi(t)$ is of bounded variation in the interval
(0,1) with $\phi(1)=0$ and if

\[ \mu_n = \int_0^1 t^n \phi(t)\,dt \qquad (n = 0, 1, 2, \dots), \]

then

\[ L_t\{\mu_n\} = \frac{\phi(t+) + \phi(t-)}{2} \qquad (0<t<1). \]


We next establish the stronger result contained in
THEOREM 39. If the function $\phi(t)$ is integrable in (0,1) and if

\[ \mu_n = \int_0^1 t^n \phi(t)\,dt \qquad (n = 0, 1, 2, \dots), \]

then

\[ L_t\{\mu_n\} = \phi(t) \]

almost everywhere in (0,1).
As we observed in §4 


60 | du = o(| 


for almost all values of t in (0,1). Let $t_0$ be such a value of t, and form the
integrals


(n+k+1)! ¢! kt 
Ji = — y)*{o(y) — o(to)}dy, | | 
n\k! to 1 — bh 
It will be sufficient to show that J; and J; approach zero with 1/k; for, 


(n+ k + 1)! 


Leto {un} nik! 


f y"(1 — y)*o(y)dy, 


Lig{un} = + lim {Te + Js}. 


Since the proof is similar for the two integrals we give only that for J;. Set 




t _ to 


It is easy to see by integration by parts that y(t) is also o(t)—t) as ¢ ap- 
proaches fy. Hence to an arbitrary positive ¢ there corresponds a number 6 
such that 

(21.6) |-v(y)| < (to— <y 
Introduce the integrals 


(ntk+1)! 


cit 


n'k! 


Ik! y(1 — y)*{o(y) — (to) }dy, 


so that 


kto | kto ] 
a= = 
1 — bo 1 — 


f — — (to) }dy, 


o—8 


If we write 


we have 
_(nt+k+1)! 


n'k! 


n\k! to 
(1 — y)*y*| 6(y) — o(¢0) | dy 


— (to — 5)(to — 5) (1 — to + 5)* 
to 
to—8 
Noting that the function 
yy te) (1 y)* 
is increasing in the interval (t)— 6, to) and applying (21.6) we obtain 
nk} 


€ 
— 85) (1 — 5)* 
| ) ( o + 4) 


€ to € 1 
0 


If | < 


€ fo 
+ <f (to — y)d { y¥tol (4 y)*} 
2 to—8 




We turn next to J’. It may evidently be written as 


(n+ k-+ 1)! 
nk! 


0 


Let ky be an integer so large that 


Roto 
1 — to 


> 


Then 


(0< ys 1), 


koto! y-a(1 — < 1 


and 


nik! 
(n +k +1)! 


The function 


0 


sy tol te) (1 — 


is increasing in the interval (0, fo), so that 


n+k-+1)! 
| TZ’ | < ( ) (to §) tol (to) — bh + 5)*-*oM, 


n'k! 


where 


ty-8 
we | — (to) | dy. 


But the right-hand side of this inequality tends to zero with 1/k, as we proved
in §20, so that we can determine $k_1$ so large that


| | <¢/2 (k > hi), 


and 


| Ze | <e. 


The theorem is thus established. 
As illustrations of Theorems 39 and 41 it is interesting to show by direct 
evaluation of the limits concerned that 


1 
L 1 
t (0<t< 1), 
O<t<1 
). 
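A direct evaluation of this kind can also be carried out by machine. The sketch below is an illustration under stated assumptions rather than the computation intended in the text: the helper names `L`, `moments_of_one` and `moments_of_t` are introduced here, and the two test densities, 1 and t, are chosen for convenience. It applies the operator L_{k,t} in exact rational arithmetic, using the expansion of (-1)^k Δ^k μ_n as the alternating binomial sum of μ_n, μ_{n+1}, ..., μ_{n+k}.

```python
from fractions import Fraction
from math import comb, floor

def L(k, t, mu):
    """L_{k,t}{mu_n} = (n+k+1)!/(n! k!) * (-1)^k Delta^k mu_n, n = [k t/(1-t)]."""
    n = floor(k * t / (1 - t))
    # (-1)^k Delta^k mu_n = sum_j (-1)^j C(k, j) mu_{n+j} for the forward difference
    signed_diff = sum((-1) ** j * comb(k, j) * mu(n + j) for j in range(k + 1))
    # (n+k+1)!/(n! k!) = (n+k+1) * C(n+k, k)
    return (n + k + 1) * comb(n + k, k) * signed_diff

def moments_of_one(n):
    """mu_n = integral of t^n over (0,1), i.e. the density 1 (illustrative)."""
    return Fraction(1, n + 1)

def moments_of_t(n):
    """mu_n = integral of t^n * t over (0,1), i.e. the density t (illustrative)."""
    return Fraction(1, n + 2)

if __name__ == "__main__":
    t = Fraction(1, 3)
    for k in (10, 100, 400):
        print(k, L(k, t, moments_of_one), float(L(k, t, moments_of_t)))
```

For the first sequence the operator returns 1 exactly for every k; for the second it returns (n+1)/(n+k+2), which tends to t as k becomes infinite.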


22. Uniqueness theorems. We are now in a position to establish 




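THEOREM 40. If the sequence $\{\mu_n\}$ satisfies the inequalities

\[ |\Delta^k \mu_n| < M\,\frac{n!\,k!}{(n+k+1)!} \qquad (n, k = 0, 1, 2, \dots), \tag{22.1} \]

then

\[ \mu_m = \lim_{k\to\infty} \int_0^1 t^m L_{k,t}\{\mu_n\}\,dt \qquad (m = 0, 1, 2, \dots). \]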


By definition of the operator Li,.{un} we have 


= = f 


1 (n+k 1)! 


where 


+O 


H,(m) = (— A*y,dt, 


(22.2) m/ (k-+m) nik! 


m/(k+m) ! 
Ji(m) = (- yf 
0 


so that

\[ I_k(m) = H_k(m) + J_k(m). \]

Our theorem will then be established if we can show that

\[ \lim_{k\to\infty} H_k(m) = \mu_m, \qquad \lim_{k\to\infty} J_k(m) = 0 \qquad (m = 0, 1, 2, \dots). \]

In the integral (22.2) make the change of variable

\[ u = kt/(1-t), \]

so that
k k 1)! 
k+u/ (k+u)? [u]!k! 


(22.3) Hi(m) = (— 


Next we observe that 
(p+k—m-—1)! 
22.4 (— 1)*Atue. 
(22.4) (k — Dip — mk uy 


This result is easily proved by induction, making use of (22.1). 


fer 
n=|—— |. 
1-—t 
Set 




Now write the summation (22.4) as an integral as follows: 


([u] + & — m — 1)! 
(22.5) Bm = (— (k — 1)'([u] — m)! 


Then 
k ({u] + &+ 1)! 
— 1)!([u] — m)! 


By virtue of (22.1) 
k fu] [ul —1 
[ul-m+1 k+u 
u [uJ +k+1 [ul] +e [uJ +k—m 


du. 


= 1 + = 14 0(— 


uniformly for m<u<. Also 


1| = (j = 0,1, 2,---,m— 1), 


where N is a constant independent of u and of 7. Hence for m=1 we have 


k 1 
(k—> 


so that 
(22.6) lim Hi(m) = pm (m = 1, 2,3,---). 


For m=0 we have from (22.3) 
=. (9+ 1)! k 
0) = (— 1)* > 
= 1)! 
(k — 1)!p! 


du 


=(- 1)! 




But reference to (22.4) shows that this is $\mu_0$, and (22.6) holds also for m=0.
It remains to show that

\[ \lim_{k\to\infty} J_k(m) = 0 \qquad (m = 0, 1, 2, \dots). \tag{22.7} \]


By virtue of (22.1) we have

\[ |J_k(m)| \le \int_0^{m/(k+m)} t^m M\,dt < \frac{Mm}{k+m}, \]

so that (22.7) is evident. The theorem is thus established.
We turn next to a corresponding result for Stieltjes integrals. 


THEOREM 41. If


(p+ h)! 
(22.8) 


aty,| <M (m = 0, 1, 2,--- ; 


then 


1 
Um — = lim un} 
0 


provided Si: {un} is defined as 


To prove this result we rewrite the sum (22.8) as follows: 


Now 


(+H! 


1 (p+k-—1)! 


for any integer g>m. Changing the summation variable in the last of these 
sums we have 


(p+tk— 
— 1)! 
k)! 


kip! 


= (- 1) 


). 

p=m 




k)! 
= (— 1)! >> (e+)! 
pam+1 kip! 

q+k+1)! 

+ ( ) kim! ( ) kg + 1)! Me+2 
Now let g become infinite. Since the summations on both sides of the equation 
approach limits by hypothesis, it follows that 


A 


exists (k=O, 1, 2, - - - ). We can show, in fact, that all these limits are zero 
except perhaps that corresponding to k=0. For, suppose 
lim gAu, = B>O. 
q 
Then for g=qo we have 
gAu, > B/2, 


Hg tn+1 = Au,> 7? 


As n becomes infinite the right-hand side of this inequality becomes
positively infinite, so that $\mu_q$ can approach no limit as q becomes infinite,
contrary to the fact established above. If B<0 we deduce a contradiction by
applying the foregoing proof to the sequence $\{-\mu_q\}$. Proceeding by induction
suppose it has been established that


and 


let us show the same to be true when & is replaced by +1. We have 


(22.9) (q+ 


By assumption the right-hand side approaches a limit. It follows that 
(q + &)! \ 
A, A+ 
q { 


approaches the same limit. Since 


(22.10) Gat, 


Kg+1- 




also approaches a limit we may apply the argument used for the case k=0 
to the sequence (22.10). It follows that the limit of this sequence must be 
zero and hence by (22.9) that 


qo (k + 1)!q! 
We showed in §21 that 
(n+ k+ 


nik! 


= 0. 


(22.11) 


(i+k+1)! 
= k+1 
+ (— 1) HG + D! 


provided that 


(i+k+1)! 
iso (k + 1)!i! 


= 0. 


But we have just established this latter result. Then by (22.2) and (22.8) 


k +1)! 


we have 


nik! n-+-1 


By Theorem 40 it follows that 


1 
bat 
m+1 ko Jo n+1 


By (22.11) we see that 


Mn+1 
L 


Hence 
— lim imS nid 


1 1 1 
li S n li f mys 


1 


kw 0 


This completes the proof of the theorem. 


n 




23. Hausdorff’s theorem. Just as our inversion operator enabled us to 
give a proof of Bernstein’s theorem so will the present inversion operator en- 
able us to give a proof of a familiar theorem of Hausdorff. 


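THEOREM 42. A necessary and sufficient condition that the equations

\[ \mu_n = \int_0^1 t^n\,d\alpha(t) \qquad (n = 0, 1, 2, \dots) \]

should have a bounded non-decreasing solution $\alpha(t)$ is that the sequence $\{\mu_n\}$
should be completely monotonic:

\[ (-1)^k \Delta^k \mu_n \ge 0 \qquad (k = 0, 1, 2, \dots;\ n = 0, 1, 2, \dots). \]

The necessity of the condition follows at once from the equation

\[ (-1)^k \Delta^k \mu_n = \int_0^1 t^n(1-t)^k\,d\alpha(t). \]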
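The condition of complete monotonicity is easily tested on a finite stretch of a sequence. The following sketch is an illustration only: the function names are introduced here, and a finite check of this kind can refute complete monotonicity but can never establish it for the whole sequence. It forms every difference $(-1)^k \Delta^k \mu_n$ that can be computed from the given terms, with $\Delta\mu_n = \mu_{n+1} - \mu_n$.

```python
from fractions import Fraction
from math import comb

def signed_difference(mu, k, n):
    """(-1)^k Delta^k mu_n, with Delta mu_n = mu_{n+1} - mu_n."""
    return sum((-1) ** j * comb(k, j) * mu[n + j] for j in range(k + 1))

def completely_monotonic_prefix(mu):
    """True if every difference computable from the finite list mu is >= 0."""
    size = len(mu)
    return all(signed_difference(mu, k, n) >= 0
               for k in range(size) for n in range(size - k))

if __name__ == "__main__":
    harmonic = [Fraction(1, n + 1) for n in range(12)]   # moments of alpha(t) = t - 1
    geometric = [Fraction(1, 2) ** n for n in range(12)]  # moments of a unit jump at t = 1/2
    print(completely_monotonic_prefix(harmonic))
    print(completely_monotonic_prefix(geometric))
    print(completely_monotonic_prefix([Fraction(1), Fraction(0), Fraction(1)]))
```

The first two sequences are moment sequences of non-decreasing functions and pass the test; the third fails already at a first difference.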


To prove the sufficiency apply the operator $S_t$ to the sequence $\{\mu_n\}$. We
must first show that $S_{k,t}\{\mu_n\}$ exists. To do this we show that for a
completely monotonic sequence $\mu_n$ we have


k)! 
lim @+ 


23.1 
( ) n\k! 


= Ck 


We use induction. The result is immediate for k=0, for the sequence $\{\mu_n\}$
is non-negative non-increasing. Next form the sequence

\[ \nu_n = \mu_{n+1} - (n+1)\Delta\mu_{n+1}. \]

This is also a non-negative sequence. Moreover,

\[ \Delta\nu_n = (n+2)\Delta\mu_{n+1} - (n+2)\Delta\mu_{n+2} = -(n+2)\Delta^2\mu_{n+1}. \]

This is not greater than zero, so that the sequence $\{\nu_n\}$ is also non-increasing,
and must therefore approach a limit. Since $\mu_{n+1}$ has a limit, the same must be
true of $(n+1)\Delta\mu_{n+1}$. We proceed by induction. Suppose that (23.1) has been
established for k<m. Form the sequence

(n + 1)(n + 2) 


— 


Vn = (n + 1)Apn+1 + 
(23.2) 


+ (~ A™Un+1- 
m'\n! i 


Simple computation gives 
. (n + m + 1)! 
+ 1)! 
It follows that $\nu_n$ is a non-negative, non-increasing sequence and hence must
approach a limit. Thus every term in (23.2) except the last is known to have
a limit. It follows that (23.1) holds for k=m, and the induction is complete.
In the proof of Theorem 41 we showed that the existence of the limits $c_k$
implied that they were all zero except perhaps $c_0$. This, in turn, implies the
convergence of the series


Av, = (- 1) A™™ 0. 


(i+tk+1)! 


Moreover 
1)! 


(— 


(23.4) 

= > (6 + + 2)! (— 
+ 1) + 1)! 

The derivation of this formula involved (23.1) as well as the convergence of
the series. But the series on the left surely converges for k=0 since the limit
of $\mu_n$ exists. Hence by induction we see that (23.3) converges for all k.
Having verified that the operator $S_{k,t}\{\mu_n\}$ exists we prove next that it
defines a non-decreasing function. We have

+1)! 


= te ki + 1)! 


As t increases n is non-decreasing, so that 


(i+k+1)! 
is surely non-increasing. That is, $S_{k,t}\{\mu_n\}$ is non-decreasing. The same must
be true of $S_t\{\mu_n\}$. Clearly


kt 
— n= | —|. 


Satin} S — = + (— 


(¢+k+1)! 
S te + + (= 
t=0 


(i+ k)! 


= + ———(— 1) = po. 
imo 




This latter result is certainly exact for k=0 and is easily seen to be true in
general by induction using (23.4). Hence the inequalities (22.8) are satisfied
by the given completely monotonic sequence $\{\mu_n\}$. Hence by Theorem 41

\[ \mu_m - \mu_\infty = \lim_{k\to\infty} \int_0^1 t^m\,dS_{k,t}\{\mu_n\}. \]

Now we employ Helly's theorem precisely as we did in the proof of Theorem
17 to select from the bounded sequence of non-decreasing functions $S_{k,t}\{\mu_n\}$
a sub-set which approaches a non-decreasing function $\beta(t)$. As in that proof
we see that

\[ \mu_m - \mu_\infty = \int_0^1 t^m\,d\beta(t). \]

Here $\beta(1) = -\mu_\infty$. We define $\alpha(t)$ as follows:

\[ \alpha(t) = \beta(t) \quad (0 \le t < 1), \qquad \alpha(1) = 0. \]

The function $\alpha(t)$ remains non-decreasing since $-\mu_\infty \le 0$ and

\[ \mu_m = \int_0^1 t^m\,d\alpha(t) \qquad (m = 0, 1, 2, \dots). \]

This completes the proof of the theorem.†
The present methods are powerful in the discussion of what sequences 
are moment sequences. We use them to prove one further result of Hausdorff. 


† Compare F. Hausdorff, Momentprobleme für ein endliches Intervall, Mathematische Zeit-
schrift, vol. 16 (1923), pp. 220-248. It is of interest to compare Hausdorff's method of approximation
to the function $\alpha(t)$ with our own. His kth approximating function $\chi_k(t)$ is a step-function which
vanishes at t=0 and has a jump of amount

\[ \binom{k}{n} (-1)^{k-n} \Delta^{k-n} \mu_n \]

at the point n/k (n=0, 1, 2, ..., k). Our kth approximating function, $S_{k,t}\{\mu_n\}$, is also a step-function
vanishing at the origin and with jump of amount

\[ \binom{n+k}{n} (-1)^{k+1} \Delta^{k+1} \mu_n \]

at the points n/(n+k) (n=0, 1, 2, ...). Thus the function $\chi_k(t)$ has (k+1) jumps which depend on
differences of order less than or equal to k. On the other hand the function $S_{k,t}\{\mu_n\}$ has infinitely
many jumps which cluster about the point t=1, the amount of the jumps depending on the (k+1)th
differences only. We note further that the function $L_{k,t}\{\mu_n\}$ is also a step-function with jumps at the
same points n/(n+k), the amounts of the jumps depending on differences of order k only.




THEOREM 43. A necessary and sufficient condition that

\[ \mu_n = \int_0^1 t^n\,d\alpha(t) \qquad (n = 0, 1, 2, \dots) \tag{23.5} \]

where $\alpha(t)$ is of bounded variation in (0,1) is that for a suitable constant M

\[ \sum_{p=0}^{\infty} \frac{(p+k)!}{p!\,k!}\, |\Delta^{k+1}\mu_p| < M \qquad (k = 0, 1, 2, \dots). \tag{23.6} \]

We first prove the necessity of the condition. Assume that $\mu_n$ is defined
by (23.5) with $\alpha(t)$ of bounded variation in (0,1). Denote by V(t) the total
variation of $\alpha(x)$ in the interval $0 \le x \le t$. Then by Theorem 42 the sequence


na f rave (n = 0, 1, 2,---) 


is completely monotonic. Hence 
| < (- 1)" 


(— =n = M. 


This completes the proof of the necessity. 
We turn to the sufficiency of the condition. By condition (23.6) we see at
once that Theorem 41 is applicable to the sequence $\{\mu_n\}$ so that

\[ \mu_m - \mu_\infty = \lim_{k\to\infty} \int_0^1 t^m\,dS_{k,t}\{\mu_n\}. \tag{23.7} \]

Simple computation shows that the function $S_{k,t}\{\mu_n\}$ is a function of
bounded variation and that its total variation in the interval (0,1) is not
greater than M.

We now employ the theorem of Helly used in the proof of Theorem 19
to select a convergent sequence of functions from the sequence $S_{k,t}\{\mu_n\}$. The
limit function $\alpha(t)$ will itself be a function of bounded variation and by (23.7)
we have

\[ \mu_m - \mu_\infty = \int_0^1 t^m\,d\alpha(t). \]

If $\alpha(t)$ is redefined at t=1 to be zero there, we have

\[ \mu_m = \int_0^1 t^m\,d\alpha(t), \]

and the proof of the theorem is complete.
24. The changes of sign in a moment sequence. The operators S and L are
useful in discussing the changes of sign in a sequence

\[ \mu_n = \int_0^1 t^n\,d\alpha(t) \qquad (n = 0, 1, 2, \dots) \]

in terms of the changes of trend in the function $\alpha(t)$ or in terms of the changes
of sign in the derivative $\alpha'(t)$, if it exists. We begin by proving

THEOREM 44. If the function $\alpha(t)$ has m changes of trend in the interval (0,1)
then the sequence

\[ \mu_n = \int_0^1 t^n\,d\alpha(t) \qquad (n = 0, 1, 2, \dots) \tag{24.1} \]

can have at most m changes of sign.

For, set

\[ \mu(x) = \int_0^\infty e^{-xt}\,d\{-\alpha(e^{-t})\}. \]

Since $\mu(x)$ is continuous (analytic in fact), and since $\mu(n) = \mu_n$, the function
$\mu(x)$ would have more than m zeros if the sequence (24.1) had more than m
changes of sign. But this is impossible by Theorem 22. We next establish


THEOREM 45. If the function $\alpha(t)$ is a normalized function of bounded
variation with $\alpha(1) = \alpha(1-) = 0$ and with m changes of trend in (0,1), then the
sequence

\[ (-1)^k \Delta^k \mu_n = \int_0^1 t^n(1-t)^k\,d\alpha(t) \qquad (n = 0, 1, 2, \dots) \tag{24.2} \]

has exactly m changes of sign for all k sufficiently large.

We prove exactly as in the proof of Theorem 23 that corresponding to
two adjoining intervals $(t_{i-1}, t_i)$, $(t_i, t_{i+1})$ of the intervals referred to in the
definition of number of changes of trend (in the first of which $\alpha(t)$ is
increasing, in the second of which, decreasing) there are three points $\xi$, $\eta$, $\zeta$ such that

\[ \alpha(\xi) < \alpha(\eta) > \alpha(\zeta) \qquad (t_{i-1} \le \xi < \eta < \zeta \le t_{i+1}). \]


We then determine kp so large that for k>ko we have 




and such that 
< Sin{ un} > Sit{un} 
for k>ko. This is possible by Theorem 37. Hence 


(¢ + + 1)! k 
(— >0, m Fen | kn | 


kit)! 1-7 
kg 
— 1)*t <0, m= k>ko. 
There is at least one term in each series, so that there must be at least one 
change of sign in the sequence {A**"y,} (n=0, 1, 2, - - - ) between the terms 


m; and my. We are thus able to show as in the proof of Theorem 23 that the 
sequence (24.1) has at least m changes of sign for & sufficiently large. That it 
can not have more follows from Theorem 44. 


COROLLARY 1. If the function $\alpha(t)$ has a maximum (minimum) at a point
$t=t_0$, then for k sufficiently large the sequence

\[ (-1)^k \Delta^k \mu_n = \int_0^1 t^n(1-t)^k\,d\alpha(t) \qquad (n = 0, 1, 2, \dots) \]

will have a change of sign at a term $n=n_k$ such that

\[ \lim_{k\to\infty} \frac{n_k}{n_k + k} = t_0. \]


For, if € is an arbitrary positive number we must show that a number kp 
exists such that the sequence A*u, has a change of sign at »=m,, where 


Nk 
m+k 


As in the proof of the theorem we see that for & sufficiently large the sequence 
changes sign between the terms with indices m and m where now 


= to — (€/2), = to + (¢/2). 
Suppose that a change of sign occurs at the term =n,. Then 
ké/(1 — m < m < mp — 6). 


<e. 


— te 


Hence 


<E<tote, 


k 

nm +1 k ny 
E < 
m+tk+1 (m+ k)\(m+k+1) 




Since the left-hand side of this last inequality approaches as k becomes 
infinite we can find a number &p so large that 


Nk 


(k > Ro). 


€ 


2 mi +k 


This proves the corollary. 


COROLLARY 2. If the function $\phi(t)$ is integrable in (0,1) and has a change of
sign at $t_0$ between 0 and 1, then the sequence

\[ (-1)^k \Delta^k \mu_n = \int_0^1 t^n(1-t)^k \phi(t)\,dt \]

has a change of sign at a term with index $n_k$ such that

\[ \lim_{k\to\infty} \frac{n_k}{n_k + k} = t_0. \]

One proves this by Corollary 1, setting

\[ \alpha(t) = \int_t^1 \phi(u)\,du \]

and observing that $\alpha(t)$ has a maximum or minimum at $t_0$. This result is a
generalization of a theorem of Fekete cited in the introduction. Fekete
considered only functions $\phi(t)$ which are continuous.
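The location of the change of sign may be illustrated by a small exact computation. The sketch below rests on assumptions made here for illustration: the density is taken to be t − t_0 with t_0 = 1/4, the helper names are hypothetical, and "change of sign at the term n" is read as the first index whose term has the sign opposite to that of the preceding non-zero term.

```python
from fractions import Fraction
from math import comb

T0 = Fraction(1, 4)  # illustrative location of the change of sign of the density

def mu(n):
    """Moments of the density t - T0 on (0,1): 1/(n+2) - T0/(n+1)."""
    return Fraction(1, n + 2) - T0 * Fraction(1, n + 1)

def signed_difference(mu, k, n):
    """(-1)^k Delta^k mu_n, with Delta mu_n = mu_{n+1} - mu_n."""
    return sum((-1) ** j * comb(k, j) * mu(n + j) for j in range(k + 1))

def first_sign_change(mu, k, n_max):
    """First index n <= n_max whose term differs in sign from the last non-zero term."""
    prev = signed_difference(mu, k, 0)
    for n in range(1, n_max + 1):
        cur = signed_difference(mu, k, n)
        if cur != 0 and prev != 0 and (cur > 0) != (prev > 0):
            return n
        if cur != 0:
            prev = cur
    return None

if __name__ == "__main__":
    for k in (30, 60, 120):
        n_k = first_sign_change(mu, k, 4 * k)
        print(k, n_k, Fraction(n_k, n_k + k))
```

For this density the kth signed difference is a positive multiple of (n+1)/(n+k+2) − t_0, so the printed ratio n_k/(n_k+k) stays close to t_0, as the corollary asserts.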

25. Complex variable. Hitherto we have considered only real moment
sequences. It is natural to inquire whether our inversion operator is still valid
for complex sequences. Let

\[ \mu_n = \int_0^1 t^n \phi(t)\,dt \qquad (n = 0, 1, 2, \dots), \tag{25.1} \]

where now $\phi(t)$ is a complex function. We shall continue to take the path of
integration as the real axis. By breaking $\phi(t)$ into real and imaginary parts
we could easily show that

\[ L_t\{\mu_n\} = \phi(t) \]

for all real t between 0 and 1. However, we wish also to obtain an inversion
formula which will be valid for complex t. The operator L becomes
meaningless for complex t since its definition involves the greatest integer contained
in kt/(1−t). We may define an operator L* which is applicable to any
sequence defined by (25.1) as follows.




DEFINITION. An operator L¥ {un} is defined by the equations 
T(wo + k + 2) ht 
Li clue) = — 1)*A*p,, 


In this definition the factor A*u, is taken to mean 
1 
0 


when ¢ is not an integer. We shall be able to show that this operator inverts 
the sequence (25.1) not only for all real t between 0 and 1 but for all t in a
circle of unit diameter with center at t=1/2. In fact we prove


THEOREM 46. If the function $(u) is analytic in the circle 
and if 
Bn = f u"(u)du, 
0 


then for any t in that circle

\[ L_t^*\{\mu_n\} = \phi(t). \]

If $t_0$ is in the circle C described in the theorem, then

\[ R(t_0/(1-t_0)) > 0, \]

where the symbol R denotes "the real part of." Hence the integral

\[ \int_0^1 u^{kw_0}(1-u)^k \phi(u)\,du, \qquad w_0 = t_0/(1-t_0), \]

converges for all positive k, and we have

\[ L_{k,t_0}^*\{\mu_n\} = \frac{\int_0^1 u^{kw_0}(1-u)^k \phi(u)\,du}{\int_0^1 u^{kw_0}(1-u)^k\,du}. \tag{25.2} \]

By the function $u^{kw_0}$ we mean exp ($kw_0 \log u$), the real determination of the
logarithm being taken. To evaluate the limit of $L_{k,t_0}^*\{\mu_n\}$ as k becomes
infinite we employ the method of O. Perron.† To make use of this method we


† O. Perron, Über die näherungsweise Berechnung von Funktionen grosser Zahlen, Sitzungsberichte
der Akademie der Wissenschaften zu München, mathematisch-physikalischen Klasse, vol. 7 (1917),
pp. 191-219.




must alter the path of integration in both integrals (25.2) so as to make it pass
through the point $t_0$.

Set

\[ g(u) = u^{w_0}(1-u) = \exp\{w_0 \log u + \log(1-u)\}, \]

where u is in the circle C and where that determination of the logarithm is
taken which is real when u is real. It is easily verified that $g'(t_0)=0$. If we set

\[ h(v) = \left(\frac{t_0+v}{t_0}\right)^{w_0} \left(\frac{1-t_0-v}{1-t_0}\right), \]

the power series development of log h(v) begins as follows:

\[ \log h(v) = -\frac{v^2}{2t_0(1-t_0)^2} + \cdots. \]

Now define r and $\theta$ by the equation

\[ re^{i\theta} = -1/\{2t_0(1-t_0)^2\} \]


and apply Perron’s result. We obtain 


f — u)*p(u)du ~ to) , 
0 


1 
f ukoo(1 u)*du 1/2 
0 


It follows that

\[ \lim_{k\to\infty} L_{k,t_0}^*\{\mu_n\} = \phi(t_0). \]
One of the hypotheses assumed by Perron, when stated for the case in hand, 


was that it should be possible to pass a curve from u=0 to u=1 in the circle
C such that for all points of the curve except $u=t_0$

\[ |g(u)| < |g(t_0)|. \]

To establish the existence of such a curve consider the level lines

\[ |g(u)| = |g(t_0)|. \]

We can show that they consist of two curves intersecting only at $t_0$ which
divide the circle into four parts. In two of these parts

\[ |g(u)| < |g(t_0)| \]

and in two

\[ |g(u)| > |g(t_0)|. \]

Without setting down the details of the proof we point out that it will 




be quite sufficient to show that, on the boundary of C, | g(u)| has only two 
relative maxima and only two minima (u=0 and w=1). 
If u=pe**®, simple computation shows that on the boundary of C, where 


p=cos 8, 
| g(u) | = (cos — cos = 


Here a and 3b are the real and imaginary parts of wy respectively. It will be 
sufficient to show that the logarithmic derivative of y(@) vanishes just twice 
for —17/2<0<7/2. But 


— = —atané@ — b+ cot (6/2). 


Since a >0 the curve 
y =atanx+b) 
consists of two branches, one descending from + to —© as x varies from 


— 7/2 to 0, the other descending from + to —© as x varies from 0 to 7/2. 
On the other hand the curve 


y = cot (0/2) 


consists of a single branch ascending from — © to + as x varies from —7/2 


to 7/2. These two curves can intersect in only two points, the points where 
(0) is maximum. One of these points clearly lies between —7/2 and 0, the 
other between 0 and 7/2. The level lines through ¢=¢) must then cut the 
circle at only four points, thus dividing the circle into four parts as described 
above. This completes the proof of the theorem. 


HARVARD UNIVERSITY, 
CAMBRIDGE, Mass. 




ON THE DISTRIBUTION OF VALUES OF
BOUNDED ANALYTIC FUNCTIONS†


BY
WLADIMIR SEIDEL‡


1. Let f(z) be a regular analytic function in the unit circle |z|<1. Consider
any point P of the periphery. Adopting a terminology due to W. Gross§ we
may associate with P three sets of points:

(i) The cluster set C(P) of f(z) in P. This is defined as the set of all those
values $\alpha$ which f(z) approaches on a sequence of points of the unit circle |z|<1
converging toward P.

(ii) The range of values R(P) of f(z) in P. A value $\alpha$ belongs to the set R(P) if,
and only if, f(z) assumes the value $\alpha$ in every neighborhood of the point P.

(iii) The convergence set $\Gamma(P)$ of f(z) in P. The set $\Gamma(P)$ consists of all
those values $\alpha$ which f(z) approaches on a Jordan arc lying, except for one
end point, in the interior of the unit circle |z|<1 and terminating in the
point P.

In case that the function f(z) under consideration is bounded: |f(z)|<M
in |z|<1, where M is some positive constant, from well known theorems one
obtains at once additional information concerning the set $\Gamma(P)$. According
to a theorem of Fatou|| the limit

\[ \lim_{r\to 1} f(re^{i\theta}) = f^*(e^{i\theta}) \qquad (z = re^{i\theta}) \]

exists for all values of $\theta$ in the interval $0<\theta<2\pi$ save perhaps for a set of
measure zero of values of $\theta$. The function $f^*(e^{i\theta})$ will henceforth be denoted by
us as the boundary function of f(z). In our terminology, Fatou's theorem may
be stated in the following form: The convergence set $\Gamma(P)$ of a bounded analytic
function f(z) in the unit circle |z|<1 contains at least one point for almost all
points P of the circumference |z|=1.


† Presented to the Society, October 29, 1932; received by the editors September 27, 1932, and,
in revised form, February 18, 1933. The author is indebted to Professor J. D. Tamarkin for the
suggestion of basing the proofs of Theorems 2-5 on Theorem 1, thus simplifying the original proofs.

‡ A part of these investigations was carried out while the author was a National Research
Fellow.

§ W. Gross, Über die Singularitäten analytischer Funktionen, Monatshefte für Mathematik und
Physik, vol. 29 (1918), pp. 3-47.

|| P. Fatou, Séries trigonométriques et séries de Taylor, Acta Mathematica, vol. 30 (1906), pp.
366-368.




By the preceding theorem and a theorem of Lindeléff, moreover, it follows 
immediately that the convergence set T(P) of a bounded analytic function f(z) 
in the unit circle | z|<1 contains one and only one point for almost all points P 
of the circumference | z|=1, and no convergence set '(P) can contain more than 
one point. 

As regards the sets C(P) and R(P), it will suffice for the present to remark 
that the set R(P) is always contained in the set C(P). 

The present paper will be concerned primarily with those bounded analytic
functions $f(z)$ in the unit circle $|z|<1$, for which the modulus of the boundary
function $|f^*(e^{i\theta})| = 1$ for almost all values of $\theta$ in the interval $0 < \theta < 2\pi$.
Throughout this paper such functions will be called of class (A). R. Nevanlinna‡
was the first to point out the interest which lies in this class of functions.
There exists a wide range of well known functions which belong to the
class which we propose to study. Of these we shall mention the following three
groups:

(i) If $a_1, a_2, \cdots, a_{m-k}$ are points interior to the unit circle, all functions
of the form

\[ R(z) = e^{i\alpha} z^k \prod_{i=1}^{m-k} \frac{z - a_i}{1 - \bar a_i z}, \tag{1.1} \]

where $k$ and $m$ are positive integers and $\alpha$ is any real number. These functions
yield the most general $(1, m)$ conformal correspondence of the unit
circle with itself.§

(ii) All infinite products of the form

\[ B(z) = \prod_{i=1}^{\infty} \frac{|a_i|}{a_i}\,\frac{a_i - z}{1 - \bar a_i z}, \tag{1.2} \]

where the points $a_i$ are interior to the unit circle and satisfy the condition

\[ \prod_{i=1}^{\infty} |a_i| > 0. \tag{1.3} \]

It has been shown by W. Blaschke|| that, when condition (1.3) is satisfied,


† E. Lindelöf, Sur un principe général de l'analyse, Acta Societatis Scientiarum Fennicae, vol. 46
(1915), No. 4, pp. 1-35.

‡ R. Nevanlinna, Über beschränkte analytische Funktionen, Annales Academiae Scientiarum
Fennicae, (A), vol. 32 (1929), No. 7, p. 64.

§ For a detailed account of these products, cf. T. Radó, Zur Theorie der mehrdeutigen konformen
Abbildungen, Acta Litterarum ac Scientiarum Regiae Universitatis Hungaricae Francisco-Josephinae,
vol. 1 (1922), pp. 55-64; as well as G. Julia, Principes Géométriques d'Analyse, part 1, Paris, 1930,
pp. 54-59.

|| W. Blaschke, Eine Erweiterung des Satzes von Vitali über Folgen analytischer Funktionen,
Leipziger Berichte, vol. 67 (1915), pp. 194-200; also G. Julia, loc. cit. in preceding footnote.


1934] VALUES OF BOUNDED ANALYTIC FUNCTIONS 203 


the product $B(z)$ converges uniformly in every closed subregion lying wholly
interior to the unit circle $|z|<1$, thus defining there an analytic function
$B(z)$, which is commonly called a Blaschke product. It was shown by F. Riesz†
that the boundary function $B^*(e^{i\theta})$ of a Blaschke product $B(z)$ has the
modulus 1 in almost all points of the circumference $|z|=1$. This proves that
Blaschke products belong to the class of functions under consideration. In a
previous paper‡ the author showed that if the numbers $a_i$ in (1.2) converge to
a single point $P$ of the circumference $|z|=1$, then the range of values $R(P)$
of the Blaschke product $B(z)$ consists of all the points of the unit circle
$|w|<1$, with the possible exception of at most one point. In this paper a
further study is made of the ranges of values of Blaschke products under
more general distributions of the zeros $\{a_i\}$.

(iii) Consider the unit circle $|w|<1$ and a set $S$ of points closed relatively
to the circle $|w|<1$. On removing all points of $S$ from the unit circle
$|w|<1$, at least one in general infinitely connected region $\Sigma$ is obtained.
According to general existence theorems of conformal mapping§ it is known that
there exists an infinitely multiple-valued function $z = \phi(w)$ analytic in $\Sigma$ and
assuming in it every value from the interior of the unit circle $|z|<1$ once and
only once. This function is said to map $\Sigma$ conformally on the circle $|z|<1$.
As will be proved in this paper, the inverse function $w = f(z)$ of the mapping
function $z = \phi(w)$ belongs under appropriate restrictions on the set $S$ to the
class of functions under consideration and may be represented as a linear
function of a Blaschke product.
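
As a numerical illustration of groups (i) and (ii), the following sketch evaluates a finite
Blaschke product and contrasts two choices of the moduli $|a_i|$ in condition (1.3). It is only
a sketch under stated assumptions: the normalization $(|a|/a)(a - z)/(1 - \bar a z)$ of each
factor, the sample zeros, and the helper names `blaschke` and `partial_modulus_product` are
illustrative choices and are not taken from the paper.

```python
import cmath


def blaschke(z, zeros):
    """Finite Blaschke product with the given nonzero zeros inside the unit disk.

    Assumed normalization of each factor: (|a|/a) * (a - z) / (1 - conj(a) * z).
    """
    value = 1 + 0j
    for a in zeros:
        value *= (abs(a) / a) * (a - z) / (1 - a.conjugate() * z)
    return value


def partial_modulus_product(moduli):
    """Partial product of the moduli |a_i| appearing in condition (1.3)."""
    product = 1.0
    for m in moduli:
        product *= m
    return product


# Hypothetical sample zeros, all of modulus less than one.
zeros = [0.5 + 0j, 0.3 + 0.4j, -0.2 - 0.7j]

# Modulus at an interior point and at a few points of the circumference |z| = 1.
inside = abs(blaschke(0.1 + 0.2j, zeros))
on_circle = [abs(blaschke(cmath.exp(1j * t), zeros)) for t in (0.0, 1.0, 2.5, 4.0)]

# |a_i| = 1 - 1/i^2 keeps the product positive (limit 1/2), whereas
# |a_i| = 1 - 1/i drives it to zero (the partial product up to N equals 1/N).
N = 2000
convergent = partial_modulus_product(1 - 1 / i**2 for i in range(2, N + 1))
divergent = partial_modulus_product(1 - 1 / i for i in range(2, N + 1))

print(inside, on_circle, convergent, divergent)
```

The first choice of moduli satisfies (1.3) and therefore admits a Blaschke product; the second does not.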

Finally, in connection with this work mention should be made of recent
papers by G. Hössjer and J. L. Doob.||

2. We begin by establishing an integral representation for all functions of
class (A) in the unit circle $|z|<1$. Let $f(z)$ be a function of class (A) in $|z|<1$
and denote its zeros (if they exist) by $a_1, a_2, \cdots, a_i, \cdots$. Since $f(z)$ is
bounded, by a theorem of Blaschke¶ its zeros satisfy the inequality $\prod_{i=1}^{\infty} |a_i| > 0$.
In accordance with the result stated in §1, page 202, we may form the Blaschke
product

† F. Riesz, Über die Randwerte einer analytischen Funktion, Mathematische Zeitschrift, vol. 18
(1923), p. 94. For further results concerning Blaschke products cf. J. L. Walsh, Interpolation and
functions analytic interior to the unit circle, these Transactions, vol. 34 (1932), pp. 523-556.

‡ W. Seidel, On the cluster values of analytic functions, these Transactions, vol. 34 (1932), p. 17.

§ L. Bieberbach, Lehrbuch der Funktionentheorie, vol. II, 1931, chapter I, pp. 1-84.

|| G. Hössjer, Über die Randwerte beschränkter Funktionen, Acta Litterarum ac Scientiarum
Regiae Universitatis Hungaricae Francisco-Josephinae, vol. 5 (1930), p. 55. J. L. Doob, The boundary
values of analytic functions, these Transactions, vol. 34 (1932), pp. 153-170.

¶ Leipziger Berichte, loc. cit., pp. 194-200.




\[ B(z) = \prod_{i=1}^{\infty} \frac{|a_i|}{a_i}\,\frac{a_i - z}{1 - \bar a_i z} \]

extended over the zeros $a_i$. Hence, by the Riesz decomposition theorem† if we
define the function $g(z)$ by the relation $f(z) = B(z) g(z)$, we find that $g(z)$ is of
class (A) and different from zero in the circle $|z|<1$. Consider now the function
$h(z) = \log g(z)$ which, if we select a definite branch of the logarithm,
becomes single-valued and analytic in the circle $|z|<1$. Furthermore, $\Re h(z) < 0$
in $|z|<1$. Consequently, applying a result of Herglotz‡ to the function
$h(z)$, we obtain for it the representation


\[ h(z) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{e^{i\theta} + z}{e^{i\theta} - z}\, d\sigma(\theta) + i\beta, \tag{2.1} \]

where $\sigma(\theta)$ is a monotonic non-increasing function of $\theta$ in the interval
$-\pi \le \theta \le \pi$ and $\beta$ is some real number. There also exists the following relation§
between the function $h(z)$ and the derivative $\sigma'(\theta)$ of $\sigma(\theta)$:

\[ \lim_{z \to e^{i\theta}} \Re h(z) = \sigma'(\theta), \tag{2.2} \]

for all values of $\theta$ in the interval $-\pi < \theta < \pi$ for which $\sigma(\theta)$ possesses a
derivative, the approach $z \to e^{i\theta}$ being made along any path which is non-tangential
to the circle $|z|<1$. Since $g(z)$ is of class (A) in $|z|<1$, the left-hand side of
equation (2.2), and therefore $\sigma'(\theta)$, is equal to zero almost everywhere. This
proves the following theorem:


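THEOREM 1. Let $f(z)$ be a function of class (A) in the unit circle $|z|<1$.
Then

\[ f(z) = e^{i\beta} B(z) \exp\left[ \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{e^{i\theta} + z}{e^{i\theta} - z}\, d\sigma(\theta) \right], \tag{2.3} \]

where $B(z)$ is the Blaschke product extended over the zeros of $f(z)$, $\sigma(\theta)$ is a
monotonic non-increasing function of $\theta$ in the interval $-\pi \le \theta \le \pi$ whose derivative
$\sigma'(\theta) = 0$ almost everywhere in $-\pi \le \theta \le \pi$, and $\beta$ is a real constant.||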


It is obvious, conversely, that every function f(z) of the form (2.3) is 
of class (A) in | z| <1. 
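
The converse can be illustrated numerically in the simplest case. The sketch below assumes
$B(z) = 1$, $\beta = 0$, and takes for $\sigma(\theta)$ a step function with finitely many negative
jumps, so that the Stieltjes integral in (2.3) reduces to a finite sum; the jump data and the
name `singular_function` are hypothetical and chosen only for illustration.

```python
import cmath
import math

# Hypothetical jump data: sigma is a step function with negative jumps J_k at theta_k.
jumps = [(-1.0, 0.0), (-0.5, 2.0), (-2.0, -1.5)]  # pairs (J_k, theta_k)


def singular_function(z):
    """exp((1/2pi) * sum_k J_k (e^{i theta_k} + z) / (e^{i theta_k} - z)).

    This is formula (2.3) with B = 1 and beta = 0 for a pure step function sigma.
    """
    total = 0j
    for jump, theta in jumps:
        point = cmath.exp(1j * theta)
        total += jump * (point + z) / (point - z)
    return cmath.exp(total / (2 * math.pi))


# Inside the circle the modulus is strictly between 0 and 1.
interior_moduli = [abs(singular_function(z)) for z in (0j, 0.4 + 0.3j, -0.7j)]

# On the circumference, away from the jump points, the modulus equals 1 up to rounding.
boundary_moduli = [abs(singular_function(cmath.exp(1j * t))) for t in (1.0, 3.0, -0.7)]

# Along the radius ending at the jump point theta = 2.0 the modulus decays toward 0.
radial_moduli = [abs(singular_function(r * cmath.exp(2.0j))) for r in (0.9, 0.99, 0.999)]

print(interior_moduli)
print(boundary_moduli)
print(radial_moduli)
```

In this example the derivative of the step function vanishes except at the three jump points, so the boundary modulus is 1 almost everywhere, as class (A) requires.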


† Mathematische Zeitschrift, loc. cit.

‡ G. Herglotz, Über Potenzreihen mit positivem, reellen Teil im Einheitskreise, Leipziger Berichte,
vol. 63 (1911), pp. 501-511; p. 508.

§ Cf. G. C. Evans, The Logarithmic Potential, 1927, pp. 40-43.

|| This is a special case of a result obtained by V. Smirnoff, Sur les valeurs limites des fonctions
régulières à l'intérieur d'un cercle, Journal de la Société Physico-Mathématique de Leningrade, vol. 2
(1929), pp. 22-37.




3. As an immediate consequence of Theorem 1 we prove the following
theorem:

THEOREM 2. Let $f(z)$ be a function of class (A), not a constant, in the unit
circle $|z|<1$. If $f(z) \ne \alpha$ $(|\alpha|<1)$ in the whole circle $|z|<1$, then there exists at
least one radius $\theta = \theta_0$ such that

\[ \lim_{r \to 1} f(re^{i\theta_0}) = \alpha. \]

We may assume without loss of generality that $\alpha = 0$. For if that is not the
case, we prove the theorem for the function

\[ \frac{f(z) - \alpha}{1 - \bar\alpha f(z)} \]

which is likewise of class (A) and is different from zero in the whole unit
circle. Since $f(z)$ has no zeros in $|z|<1$, formula (2.3) reduces to

\[ f(z) = e^{i\beta} \exp\left[ \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{e^{i\theta}+z}{e^{i\theta}-z}\, d\sigma(\theta) \right], \tag{3.1} \]


where $\sigma(\theta)$ satisfies the conditions of Theorem 1. The function $\sigma(\theta)$ is not
identically a constant, for otherwise $f(z)$ would be constant. Hence, if $\sigma(\theta)$
is continuous, there exists a point $\theta = \theta_0$ in the interval $-\pi < \theta < \pi$ at which
$\sigma(\theta)$ possesses a derivative equal to $-\infty$.† An easy modification of Evans'
proof‡ shows that under these conditions the Poisson-Stieltjes integral

\[ u(r, \varphi) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{1 - r^2}{1 + r^2 - 2r\cos(\theta - \varphi)}\, d\sigma(\theta) \]

approaches the limit $-\infty$ along the radius $\varphi = \theta_0$. This proves that $|f(z)|$,
which is given by the formula

\[ |f(z)| = e^{u(r, \varphi)}, \qquad z = re^{i\varphi}, \]

approaches the limit zero along the radius $\varphi = \theta_0$.
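Suppose now that $\sigma(\theta)$ is not continuous in the interval $-\pi < \theta < \pi$. Then,
since $\sigma(\theta)$ is non-increasing, it admits the following representation:

\[ \sigma(\theta) = S(\theta) + \phi(\theta) + \omega(\theta). \]

Here all functions $S(\theta)$, $\phi(\theta)$, $\omega(\theta)$ are non-increasing; $S(\theta)$ is continuous and
$S'(\theta) = 0$ almost everywhere in the interval $-\pi < \theta < \pi$, $\phi(\theta)$ is absolutely
continuous, and $\omega(\theta)$ is a step function. Since $\sigma'(\theta) = 0$ almost everywhere in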


† Schlesinger and Plessner, Lebesguesche Integrale und Fouriersche Reihen, Berlin and Leipzig,
1926, §43.

‡ The Logarithmic Potential, loc. cit.




$-\pi < \theta < \pi$, $\phi(\theta) = 0$. If $S(\theta)$ is not constant, by the theorem mentioned in
the first footnote on page 205, $S'(\theta) = -\infty$ at a non-denumerable set of points.
Among them there will be at least one point at which $\omega(\theta)$ is continuous.
Hence, we may apply to this point the modification of Evans' proof already
indicated. There only remains now the case when $S(\theta)$ is constant. In that
case $\omega(\theta)$ is certainly not constant. It will have at least one point of
discontinuity which, without loss of generality, we may assume to be $\theta = 0$. The
Poisson-Stieltjes integral on the radius $\theta = 0$,

\[ u(r, 0) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{1 - r^2}{1 + r^2 - 2r\cos\theta}\, d\omega(\theta), \]


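will be of the form

\[ u(r, 0) = \frac{1}{2\pi}\sum_{k=0}^{\infty} J_k\, \frac{1 - r^2}{1 + r^2 - 2r\cos\theta_k}, \tag{3.2} \]

where $\theta = \theta_k$ are the points of discontinuity of $\omega(\theta)$ at which $\omega(\theta)$ has the
negative jumps $J_k$, while $J_0$ is the jump of $\omega(\theta)$ at the point $\theta = \theta_0 = 0$. We now choose
any positive number $\epsilon$ less than $|J_0|$. To this number there corresponds a
positive integer $n$ such that

\[ \sum_{k=n+1}^{\infty} |J_k| < \epsilon. \]

The sum in (3.2) will now be decomposed in the following manner:

\[ u(r, 0) = \frac{1}{2\pi} J_0\, \frac{1+r}{1-r}
  + \frac{1}{2\pi}\sum_{k=1}^{n} J_k\, \frac{1 - r^2}{1 + r^2 - 2r\cos\theta_k}
  + \frac{1}{2\pi}\sum_{k=n+1}^{\infty} J_k\, \frac{1 - r^2}{1 + r^2 - 2r\cos\theta_k}. \tag{3.3} \]

We may disregard the sum

\[ \frac{1}{2\pi}\sum_{k=1}^{n} J_k\, \frac{1 - r^2}{1 + r^2 - 2r\cos\theta_k}, \]

since it tends to 0 as $r$ tends to 1. The two remaining terms in (3.3) are
algebraically less than

\[ \frac{1}{2\pi}\,(J_0 + \epsilon)\,\frac{1+r}{1-r}, \]

which tends to $-\infty$ as $r$ tends to 1.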

This completes the proof of Theorem 2. 
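
The role of the jump at $\theta = 0$ in the last case can be made explicit by evaluating the Poisson
kernel on the radius. The following short computation is offered as a worked check; it uses
only the kernel of the Poisson-Stieltjes integral, with $\theta_k \ne 0$ denoting any other point
of discontinuity.

```latex
\frac{1 - r^2}{1 + r^2 - 2r\cos\theta}\bigg|_{\theta = 0}
  = \frac{(1 - r)(1 + r)}{(1 - r)^2}
  = \frac{1 + r}{1 - r} \longrightarrow +\infty \quad (r \to 1),
\qquad
\frac{1 - r^2}{1 + r^2 - 2r\cos\theta_k}
  \longrightarrow \frac{0}{2 - 2\cos\theta_k} = 0 \quad (r \to 1,\ \theta_k \ne 0).
```

Since the jump at $\theta = 0$ is negative, its contribution is a negative multiple of $(1+r)/(1-r)$
and tends to $-\infty$, while each single term belonging to a point $\theta_k \ne 0$ tends to zero.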

It is evident that Theorem 2 still holds if the function $f(z)$ is allowed to
assume the value $\alpha$ only a finite number of times in the circle $|z|<1$.




4. By essentially the same method one may prove the following extension
of Theorem 2:

THEOREM 3. Let $f(z)$ be a bounded analytic function, not a constant, in the
circle $|z|<1$:

\[ |f(z)| < 1. \]

Denoting by $f^*(e^{i\theta})$ the boundary function of $f(z)$, let

\[ |f^*(e^{i\theta})| = 1 \]

for almost all values of $\theta$ in the interval $A$: $(0 \le \alpha_1 < \theta < \alpha_2 < 2\pi)$. If the value
$\alpha$ $(|\alpha|<1)$ is omitted by $f(z)$, then the set of singularities of $f(z)$ on the arc $A$ is
the closed cover of the set of points defined by the solutions of the equation
$f^*(e^{i\theta}) = \alpha$.


We may assume, as in the proof of the preceding theorem, that $\alpha = 0$. The
representation (3.1) is still valid for the function $f(z)$, where $\sigma(\theta)$ is a
monotonic non-increasing function in the interval $-\pi < \theta < \pi$, the relation $\sigma'(\theta) = 0$,
however, holding almost everywhere in the open interval $\alpha_1 < \theta < \alpha_2$. If the
function $f(z)$ is analytic on the whole arc $A$, then $|f^*(e^{i\theta})| = 1$ for all points of
$A$. Therefore, the equation $f^*(e^{i\theta}) = 0$ has no solutions in the interval
$\alpha_1 < \theta < \alpha_2$. It is immediately evident that the set of singularities of $f(z)$ on the arc
$A$ contains all points of the closed cover of the set defined by the equation
$f^*(e^{i\theta}) = 0$. In fact, every point $z = e^{i\theta}$ of the arc $A$ for which $f^*(e^{i\theta}) = 0$ is a
singular point of $f(z)$. We therefore have to show now that if $P$ is a singular
point of $f(z)$ lying on the arc $A$, then either $f^*(P) = 0$ or the points defined by
the solutions of the equation $f^*(e^{i\theta}) = 0$ have $P$ as a limit point. Denote by
$\Delta$ an arbitrarily small arc of $|z| = 1$ which contains the point $P$ in its interior.
The function $\sigma(\theta)$ cannot remain constant on the arc $\Delta$. Indeed, if $\sigma(\theta)$ were
constant on $\Delta$, it would follow from equation (3.1) that

\[ f(z) = e^{i\beta}\exp\left[\frac{1}{2\pi}\int_{C\Delta} \frac{e^{i\theta}+z}{e^{i\theta}-z}\, d\sigma(\theta)\right], \]

where $C\Delta$ denotes the arc of the circle $|z| = 1$ complementary to the arc $\Delta$.
This shows that $f(z)$ is analytic on $\Delta$. Hence, if $\sigma(\theta)$ is continuous on $\Delta$, there
must exist a point of the arc $\Delta$ at which $\sigma'(\theta) = -\infty$. Since $\Delta$ was taken
arbitrarily small, this means that either $\sigma'(P) = -\infty$ or the points at which
$\sigma'(\theta) = -\infty$ have $P$ as a limit point. But now if $\sigma'(\theta) = -\infty$ for some $\theta$,
then $f^*(e^{i\theta}) = 0$ for the same value of $\theta$. If $\sigma(\theta)$ is not continuous on $\Delta$, we can
reason in a manner similar to that used in the proof of Theorem 2. This proves
the theorem.




5. Dr. J. L. Doob kindly pointed out to the author that, as a consequence
of Theorem 3, we may prove the following theorem which is a sharper form
of Nevanlinna's theorem to be mentioned in §8, page 211:

THEOREM 4. Let $w = f(z)$ be a bounded analytic function, not a constant, in
the circle $|z|<1$:

\[ |f(z)| < 1. \]

Let $0 \le \alpha_1 < \theta < \alpha_2 < 2\pi$, $r = 1$, be an arc $A$ of the circumference $|z| = 1$ such that

\[ \lim_{r \to 1} f(re^{i\theta}) = f^*(e^{i\theta}), \qquad |f^*(e^{i\theta})| = 1, \]

for almost all values of $\theta$ in the interval $\alpha_1 < \theta < \alpha_2$. Then if all these limit values
$f^*(e^{i\theta})$ of modulus one are represented by a set of points $E$ on the circumference
$|w| = 1$, either the set $E$ is the whole circumference $|w| = 1$, or $f(z)$ may be
continued analytically beyond the arc $A$.

Let $\alpha = e^{i\lambda}$, where $\lambda$ is a real constant, be an arbitrary point of the
circumference $|w| = 1$ and let the arc $A$ contain in its interior a singular point $P$ of
$f(z)$. We wish to prove that there exists a point $e^{i\theta_0}$ on the arc $A$ for which
$f^*(e^{i\theta_0}) = \alpha$.

Consider the function

\[ \phi(z) = \exp\left[\frac{f(z) + \alpha}{f(z) - \alpha}\right]. \tag{5.1} \]

This function is analytic and bounded in the circle $|z|<1$, and

\[ \lim_{r \to 1} \phi(re^{i\theta}) = \phi^*(e^{i\theta}), \qquad |\phi^*(e^{i\theta})| = 1, \]

for almost all values of $\theta$ in the interval $\alpha_1 < \theta < \alpha_2$. Furthermore, the point $P$
is a singular point of the function $\phi(z)$. For, since $P$ is a singular point of $f(z)$,
according to Theorem 8, to be proved in §10, there exists at least one value $c$
$(|c|<1)$ such that the equation $f(z) = c$ has infinitely many solutions $z_1,
z_2, \cdots$ interior to the unit circle $|z|<1$ and converging to the point $P$.
Hence, it follows from equation (5.1) that

\[ \phi(z_n) = e^{(c+\alpha)/(c-\alpha)} \qquad (n = 1, 2, \cdots). \]

If $P$ were a regular point of $\phi(z)$, the function $\phi(z)$, and hence also $f(z)$, would
be constant. Now, according to (5.1), $\phi(z) \ne 0$ in the circle $|z|<1$. Hence, by
Theorem 3 there exists at least one radius $\theta = \theta_0$ terminating in a point $z = e^{i\theta_0}$
of the arc $A$ such that

\[ \lim_{r \to 1} \phi(re^{i\theta_0}) = \phi^*(e^{i\theta_0}) = 0. \]

Hence, $f^*(e^{i\theta_0}) = \alpha$, as was to be proved.




6. Theorem 2 may be made to yield the following theorem:

THEOREM 5. Let $f(z)$ be a function of class (A) in the unit circle $|z|<1$.
Then, either $f(z)$ is a rational function of the form

\[ f(z) = e^{i\beta} z^k \prod_{i=1}^{m-k} \frac{z - a_i}{1 - \bar a_i z}, \tag{6.1} \]

where $k$ and $m$ are positive integers, $\beta$ a real number, and $a_i$ complex numbers of
modulus less than one, giving the most general $(1, m)$ conformal representation of
the unit circle into itself; or each value $\alpha$ $(|\alpha|<1)$ belongs to the cluster set
$C(P)$ of $f(z)$ in at least one point $P$ of the circumference $|z| = 1$.

Suppose some number $\alpha$ $(|\alpha|<1)$ belongs to no cluster set $C(P)$ of $f(z)$.
Forming the function

\[ \phi(z) = \frac{f(z) - \alpha}{1 - \bar\alpha f(z)}, \tag{6.2} \]

we see that $\phi(z)$ is analytic and bounded in $|z|<1$:

\[ |\phi(z)| < 1, \]

and since $f(z)$ is of class (A), the boundary function $\phi^*(e^{i\theta})$ satisfies the
relation

\[ |\phi^*(e^{i\theta})| = 1 \tag{6.3} \]

for almost all values of $\theta$ in the interval $0 < \theta < 2\pi$. Furthermore, the value 0
belongs to no cluster set $C(P)$ of $\phi(z)$. Hence, there exist at most a finite
number of points $z_1, z_2, \cdots, z_m$ of $|z|<1$ at which $\phi(z)$ vanishes. Letting,
therefore,

\[ \phi(z) = \prod_{k=1}^{m} \frac{z - z_k}{1 - \bar z_k z}\, \psi(z), \tag{6.4} \]

we have by Schwarz's Lemma† that $\psi(z)$ is analytic and bounded in $|z|<1$
satisfying there the inequality

\[ |\psi(z)| < 1. \]

According to (6.3) its boundary function $\psi^*(e^{i\theta})$ is 1 in modulus:

\[ |\psi^*(e^{i\theta})| = 1 \tag{6.5} \]

for almost all values of $\theta$ in the interval $0 < \theta < 2\pi$. Finally, $\psi(z) \ne 0$ in $|z|<1$
and 0 belongs to no cluster set $C(P)$ of $\psi(z)$. Since, as shown by (6.5), $\psi(z)$


† G. Julia, loc. cit., p. 67.




is of class (A), it follows from Theorem 2 that $\psi(z)$ is identically a constant
of modulus one. It is now easily seen with the aid of the relation (6.4) that
$f(z)$ admits the representation (6.1).

7. We now consider a certain extension of Schwarz’s reflection principle 
which will be found useful in the sequel. It may be formulated in the follow- 
ing manner: 


THEOREM 6. Let $f(z)$ be a bounded analytic function in the unit circle $|z|<1$:

\[ |f(z)| < 1. \]

Let $0 \le \alpha_1 < \theta < \alpha_2 < 2\pi$, $r = 1$, be an arc $A$ of the periphery of the unit circle such
that

\[ \lim_{r \to 1} |f(re^{i\theta})| = 1 \]

for almost all values of $\theta$ in the interval $\alpha_1 < \theta < \alpha_2$. Then, either $f(z)$ may be
continued analytically beyond the arc $A$ or every value $\alpha$ $(|\alpha|<1)$ belongs to at least
one of the cluster sets $C(P)$ of $f(z)$ for some point $P$ of $A$.

The theorem need merely be proved for the value $\alpha = 0$. Indeed, let us
assume that the theorem is true for $\alpha = 0$ and let $\beta \ne 0$ be some other value such
that $|\beta|<1$. We wish to show that unless $f(z)$ may be continued analytically
beyond the arc $A$, the value $\beta$ is contained in at least one of the sets $C(P)$.
For suppose that were not the case. Consider the function

\[ \phi(z) = \frac{f(z) - \beta}{1 - \bar\beta f(z)}. \]

Then, clearly $|\phi(z)|<1$ and for almost all values of $\theta$ in the interval $\alpha_1 < \theta < \alpha_2$
we have

\[ \lim_{r \to 1} |\phi(re^{i\theta})| = 1. \]

Furthermore, we know that $\phi(z)$ may not be continued analytically beyond
the arc $A$ and the value 0 is contained in none of the sets $C(P)$ formed for
the function $\phi(z)$. This, however, contradicts our hypothesis, according to
which the theorem was true for the value $\alpha = 0$.
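
The reduction to the value 0 rests on the fact that the linear transformation
$w \mapsto (w - \beta)/(1 - \bar\beta w)$ carries the unit circle into itself, sends $\beta$ to 0, and
preserves modulus one on the periphery. A small numerical check follows; the value of
`beta`, the sample points, and the name `mobius` are hypothetical choices made for illustration.

```python
import cmath


def mobius(w, beta):
    """Linear transformation w -> (w - beta) / (1 - conj(beta) * w), with |beta| < 1."""
    return (w - beta) / (1 - beta.conjugate() * w)


beta = 0.3 - 0.4j  # hypothetical value with |beta| = 0.5 < 1

# Interior points stay interior, and beta itself is carried to 0.
interior_points = [0j, 0.5 + 0.1j, -0.9j, beta]
interior_images = [mobius(w, beta) for w in interior_points]

# Points of modulus one keep modulus one.
boundary_moduli = [abs(mobius(cmath.exp(1j * t), beta)) for t in (0.0, 0.7, 3.0, 5.5)]

print(interior_images)
print(boundary_moduli)
```

Consequently the transformed function inherits the bound and the boundary modulus from $f(z)$, while the roles of the values $\beta$ and 0 are exchanged.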

In order to prove the theorem for $\alpha = 0$, we observe that if the value 0 is
contained in none of the sets $C(P)$ formed for the function $f(z)$, there exists a
region $R$ lying in the interior of the unit circle and with the arc $A$ as part of
its boundary, in which $1 > |f(z)| > \rho$, where $\rho$ is some positive number. The
remainder of the proof is analogous to the standard proof of Schwarz's reflection
principle. The obvious modifications can be easily supplied by the reader.




8. As a corollary to Theorem 6 we obtain the following result:

COROLLARY. Let $f(z)$ be a bounded analytic function in the unit circle
$|z|<1$: $|f(z)|<1$.
Let $0 \le \alpha_1 < \theta < \alpha_2 < 2\pi$, $r = 1$, be an arc $A$ of the periphery of the unit circle
such that

\[ \lim_{r \to 1} |f(re^{i\theta})| = 1 \]

for almost all values of $\theta$ in the interval $\alpha_1 < \theta < \alpha_2$. Then, if $P$ is an arbitrary
interior point of $A$, either $f(z)$ is analytic in $P$ or the cluster set $C(P)$ formed for
$f(z)$ is the closed unit circle $|w| \le 1$.


Suppose $P$ is not a point of analyticity of the function $f(z)$. The corollary
follows at once if one applies Theorem 6 to a sequence of arcs $A_n$ of the
periphery containing the point $P$ whose lengths tend to zero with $1/n$.

This corollary sharpens Theorem 5 in that in the second case of that theorem
every point $P$ of the periphery $|z| = 1$ is either a point of analyticity of
$f(z)$ or $C(P)$ is the closed unit circle $|w| \le 1$.

Finally it may be mentioned in passing that by means of Theorem 6 one
may easily prove the following theorem of R. Nevanlinna†:


THEOREM OF R. NEVANLINNA. Let $f(z) = w$ be a bounded analytic function in
the unit circle $|z|<1$: $|f(z)|<1$. Let $\alpha_1 < \theta < \alpha_2$, $r = 1$, be an arc $A$ of the
periphery of the unit circle such that

\[ \lim_{r \to 1} f(re^{i\theta}) = f^*(e^{i\theta}), \qquad |f^*(e^{i\theta})| = 1, \]

for almost all values of $\theta$ in the interval $\alpha_1 < \theta < \alpha_2$. Then, if these limit values
$f^*(e^{i\theta})$ are represented by a set of points $E$ on the periphery of the unit circle
$|w| = 1$, either the set $E$ is measurable and of measure $2\pi$, or $f(z)$ may be
continued analytically beyond the arc $A$.


9. Again let $w = f(z)$ be a function of class (A) in the circle $|z|<1$. Denote
by $z = \phi(w)$ the inverse function of $w = f(z)$. This function is in general
infinitely many-valued. In the theorem that follows, we establish a connection
between the non-algebraic singularities of the function $\phi(w)$ and the convergence
values of the function $f(z)$ in precisely the same manner that Hurwitz
and Iversen‡ established such a connection for functions meromorphic in the
finite plane. Before stating the theorem, we shall give Bieberbach's definition
of a singular point:

† R. Nevanlinna, Annales Academiae Scientiarum Fennicae, loc. cit., p. 28.

‡ A. Hurwitz, Sur les points critiques des fonctions inverses, Paris Comptes Rendus, vol. 143
(1906), pp. 877-879, and vol. 144 (1907), pp. 63-65; F. Iversen, Recherches sur les Fonctions Inverses
des Fonctions Méromorphes, Thèse, Helsingfors, 1914, p. 13.




If the members of a chain of regular elements $\mathfrak{P}(w - a)$ of the function
$z = \phi(w)$ are obtainable from one another by direct continuation in such a manner
that their centers have a single limit point and their radii of convergence tend to
zero, then this chain is said to define a singular point. If $w = \alpha$ is the coordinate
of the limit point, the singular point is said to lie over the point $w = \alpha$ of the
$w$-plane. If the singular point is not algebraic in character, it is said to be
non-algebraic.

The theorem which we shall find useful in the sequel is as follows: 


THEOREM 7. Let $w = f(z)$ be a function of class (A) in the unit circle $|z|<1$.
If to some number $\alpha$ $(|\alpha|<1)$ there exists a radius $\theta = \theta_0$ for which

\[ \lim_{r \to 1} f(re^{i\theta_0}) = \alpha, \]

then the inverse function $z = \phi(w)$ of $w = f(z)$ has a non-algebraic singularity
over the point $w = \alpha$. And, conversely, if $\phi(w)$ has a non-algebraic singularity
over the point $w = \alpha$, then there exists at least one radius $\theta = \theta_0$ for which

\[ \lim_{r \to 1} f(re^{i\theta_0}) = \alpha. \]

Since the proof of this theorem is almost identical with that of Iversen's
theorem, we shall omit it here.

10. We now turn to the study of the ranges of values $R(P)$ of our functions.
As an immediate consequence of the corollary of Theorem 6, we state
the following theorem:


THEOREM 8. Let $w = f(z)$ be a bounded analytic function in the unit circle
$|z|<1$:

\[ |f(z)| < 1. \]

Let $0 \le \alpha_1 < \theta < \alpha_2 < 2\pi$, $r = 1$, be an arc $A$ of the periphery of the unit circle such
that

\[ \lim_{r \to 1} |f(re^{i\theta})| = 1 \]

for almost all values $\theta$ in the interval $\alpha_1 < \theta < \alpha_2$. Then, if $P$ is a singular point of
$f(z)$ lying in the interior of the arc $A$, the range of values $R(P)$ is a set of points
everywhere dense in the unit circle $|w|<1$.

Consider a sequence of circles $\{C_n\}$ about the point $P$ as center with radii
tending to zero. Denote by $V_n$ the set of values which $f(z)$ assumes in that
part of the circle $C_n$, exclusive of its periphery, which lies in the circle


† For further details concerning these matters, cf. L. Bieberbach, Lehrbuch der
Funktionentheorie, vol. 1 (1923), pp. 207-217.




$|z|<1$. Each of the sets $V_n$ is open and by the corollary to Theorem 6 is
everywhere dense in the circle $|w|<1$. Furthermore, we have

\[ R(P) = \prod_{n=1}^{\infty} V_n. \]

Hence, by a well known theorem,† the set $R(P)$ is dense in the unit circle
$|w|<1$.

11. Under more restrictive hypotheses it is possible to give a sharper 
statement of Theorem 8. 


THEOREM 9. If $w = f(z)$ is a bounded analytic function in the unit circle
$|z|<1$ whose boundary function $f^*(e^{i\theta})$ satisfies the condition $|f^*(e^{i\theta})| = 1$ for
all values of $\theta$ in the interval $0 \le \theta < 2\pi$ except perhaps in a denumerable set, then
either $f(z)$ is of the form (1.1) or it assumes infinitely often all values of the unit
circle $|w|<1$ except perhaps for a denumerable set of values.

According to Theorem 2, if $f(z)$ omits (or assumes a finite number of
times) a value $w$ $(|w|<1)$, then there exists at least one radius $\theta = \theta_0$ such
that

\[ \lim_{r \to 1} f(re^{i\theta_0}) = w. \]

Hence, unless the set of values $w$ which is omitted (or assumed only a finite
number of times) is denumerable, we have $|f^*(e^{i\theta})| < 1$ for a non-denumerable
set of values of $\theta$ in the interval $0 < \theta < 2\pi$.

The following still sharper theorem was obtained by the author‡ in
another paper:

Let $w = f(z)$ be a bounded analytic function in the circle $|z|<1$: $|f(z)|<1$.
Let $\{z_k\}$ be an infinite sequence of points interior to the unit circle converging
toward $z = 1$ in which $f(z)$ vanishes and let $A$ be an arc of the unit circle,
$-\alpha < \theta < \alpha$, $z = e^{i\theta}$, containing $P$: $(z = 1)$, on which $f(z)$ is continuous except for $P$ and
assumes values of modulus one. Then $R(P)$ is the unit circle $|w|<1$ with the
exception of at most one point.


As a corollary to this theorem, we may state the following result:

THEOREM 10. Let $w = f(z)$ be of class (A) in the unit circle $|z|<1$. If $f(z)$
omits two or more values of modulus less than one, the set of singularities of $f(z)$
on the circumference $|z| = 1$ is perfect.


† Cf. C. Carathéodory, Vorlesungen über reelle Funktionen, Leipzig and Berlin, 1927, p. 63,
Theorem 5.

‡ These Transactions, loc. cit., p. 17.




If $f(z)$ is analytic on the boundary $|z| = 1$, then $f(z)$ is necessarily a
rational function of the form (1.1). Since, however, the function (1.1) assumes
all values of modulus less than one, it follows that $f(z)$ possesses a singularity
$P$ on the circumference $|z| = 1$. This singularity $P$ cannot be isolated. For if
that were the case, there would exist by Theorem 8 a number $c$ $(|c|<1)$ and
a sequence of points $z = z_k$ interior to the circle $|z|<1$ and converging toward
$P$ such that $f(z_k) = c$. If we now form the function $(f(z) - c)/(1 - \bar c f(z))$ and
apply the theorem just quoted, we obtain a contradiction. Since the set of
singularities of $f(z)$ is closed and cannot have isolated points, it is perfect.

12. We shall now investigate the set $\Sigma$ of those values which functions
$f(z)$ of class (A) assume infinitely often in the unit circle.

A function of class (A) may omit one value. This is evidently true of the
function $f(z) = e^{(z+1)/(z-1)}$ which omits the value 0.
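
The behavior of this example can be checked numerically. The sketch below, in which the
sample points are arbitrary choices, confirms that $|f(z)| < 1$ inside the circle, that
$|f(z)| = 1$ at points of the circumference other than $z = 1$, and that $f(z)$ tends to the
omitted value 0 along the radius ending at $z = 1$.

```python
import cmath


def f(z):
    """The function exp((z + 1) / (z - 1)), analytic for z != 1."""
    return cmath.exp((z + 1) / (z - 1))


# Moduli at a few interior points: all strictly between 0 and 1.
interior_moduli = [abs(f(z)) for z in (0j, 0.5j, -0.8 + 0.1j, 0.3 - 0.6j)]

# Moduli on the circumference away from z = 1: equal to 1 up to rounding.
boundary_moduli = [abs(f(cmath.exp(1j * t))) for t in (0.5, 1.0, 3.0, 5.0)]

# Along the radius to z = 1 the modulus equals exp((r + 1) / (r - 1)) and decays to 0.
radial_moduli = [abs(f(complex(r, 0))) for r in (0.5, 0.9, 0.99)]

print(interior_moduli)
print(boundary_moduli)
print(radial_moduli)
```

The value 0 is thus a convergence value of $f(z)$ at $z = 1$ although it is never assumed, in agreement with Theorem 2.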

Furthermore, even a Blaschke product,† being of class (A), may omit
one value. This is readily shown by establishing directly from the Weierstrass
product formula the following identity:

\[ \frac{\exp\left[\dfrac{z+1}{z-1}\right] - e^{-1}}{1 - e^{-1}\exp\left[\dfrac{z+1}{z-1}\right]}
  = -z \prod_{n=-\infty}^{\infty}{}^{\prime}\; \frac{|a_n|}{a_n}\,\frac{a_n - z}{1 - \bar a_n z}, \tag{12.1} \]

the prime on the product sign indicating that the factor for $n = 0$ is omitted.
The function on the left evidently omits the value $-e^{-1}$. The function on the
right is a Blaschke product of the form (1.2), where

\[ a_n = \frac{\pi n i}{\pi n i + 1}. \]

Thus the Blaschke product in (12.1) omits the value $-e^{-1}$. Since
$w = e^{(z+1)/(z-1)}$ is the inverse function of that function which maps the punctured
circle $0 < |w| < 1$ on the unit circle $|z|<1$, we find that the inverse
function of this mapping function may be represented as a linear function of the
Blaschke product $B(z)$ in (12.1):

\[ e^{(z+1)/(z-1)} = \frac{B(z) + e^{-1}}{1 + e^{-1}B(z)}. \]
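
A numerical sanity check of the ingredients of (12.1) may be helpful. The sketch assumes that
the left-hand function is $(e^{(z+1)/(z-1)} - e^{-1})/(1 - e^{-1}e^{(z+1)/(z-1)})$ and that the zeros
are $a_n = \pi n i/(\pi n i + 1)$; the helper names and the truncation level `N` are illustrative.
It verifies that each $a_n$ is a zero of the left-hand function, and that the product of the
moduli $|a_n|$ over $n \ne 0$ approaches $1/\sinh 1$, which is also the modulus of the derivative
of the left-hand function at the origin.

```python
import cmath
import math

C = math.exp(-1)


def left_side(z):
    """Assumed left-hand function of (12.1): (w - 1/e) / (1 - w/e), w = exp((z+1)/(z-1))."""
    w = cmath.exp((z + 1) / (z - 1))
    return (w - C) / (1 - C * w)


def zero(n):
    """Assumed zero a_n = (pi n i) / (pi n i + 1); n = 0 gives the zero at the origin."""
    return (math.pi * n * 1j) / (math.pi * n * 1j + 1)


# Each a_n should be a zero of the left-hand function and lie inside the unit circle.
indices = (-3, -1, 0, 1, 2, 5)
residuals = [abs(left_side(zero(n))) for n in indices]
moduli = [abs(zero(n)) for n in indices]

# Truncated product of |a_n| over 0 < |n| <= N, to be compared with 1 / sinh(1).
N = 20000
modulus_product = 1.0
for n in range(1, N + 1):
    modulus_product *= abs(zero(n)) ** 2  # a_n and a_{-n} have equal modulus

# Difference quotient at the origin: approximates the derivative there.
h = 1e-6
slope = (left_side(h) / h).real

print(max(residuals), modulus_product, 1 / math.sinh(1), slope)
```

The negative slope at the origin is what fixes the sign of the constant factor relating the left-hand function to the product taken over its zeros.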


13. This last result will now be extended in the following manner: 


† For Blaschke products cf. §1 of this paper.




THEOREM 11. Let $f(z)$ be a function of class (A) in the unit circle $|z|<1$.
Let the value $\alpha$ $(|\alpha|<1)$ which is assumed infinitely often by $f(z)$ in the circle
$|z|<1$ be not a convergence value of $f(z)$. That is, for no radius $\theta = \theta_0$ does the
equation

\[ \lim_{r \to 1} f(re^{i\theta_0}) = \alpha \tag{13.1} \]

hold. Then $f(z)$ is equal to a linear function of a Blaschke product:

\[ f(z) = \frac{e^{i\beta}B(z) + \alpha}{1 + \bar\alpha e^{i\beta}B(z)}. \]

For consider the function

\[ f_1(z) = \frac{f(z) - \alpha}{1 - \bar\alpha f(z)}. \tag{13.2} \]

By hypothesis $f_1(z)$ has infinitely many zeros in points $a_1, a_2, \cdots$.
By Theorem 1 $f_1(z)$ can be represented in the following manner:

\[ f_1(z) = e^{i\beta}B(z)\exp\left[\frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{e^{i\theta}+z}{e^{i\theta}-z}\,d\sigma(\theta)\right], \tag{13.3} \]

where $\beta$ is a real constant, $B(z)$ the Blaschke product extended over the zeros
of $f_1(z)$, and $\sigma(\theta)$ a monotonic non-increasing function of $\theta$ in the interval
$-\pi < \theta < \pi$ for which $\sigma'(\theta) = 0$ in almost all points of $-\pi < \theta < \pi$. Now if
$\sigma(\theta)$ is not constant in the whole interval $-\pi < \theta < \pi$, by the reasoning of
§3 there exists a value $\theta_0$ of $\theta$ such that $f_1^*(e^{i\theta_0}) = 0$ or $f^*(e^{i\theta_0}) = \alpha$. This
contradicts the non-existence of the relation (13.1). Hence, $\sigma(\theta)$ is a constant and
formula (13.3) reduces to

\[ f_1(z) = e^{i\beta}B(z). \]
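
The passage from this reduced formula back to $f(z)$ is the inversion of the linear
transformation defining $f_1(z)$. Written out as a short check, with
$f_1 = (f - \alpha)/(1 - \bar\alpha f)$ and with the denominator $1 + \bar\alpha f_1$ different from zero
because $|\alpha f_1| < 1$:

```latex
f_1\,(1 - \bar\alpha f) = f - \alpha
\;\Longrightarrow\;
f_1 + \alpha = f\,(1 + \bar\alpha f_1)
\;\Longrightarrow\;
f = \frac{f_1 + \alpha}{1 + \bar\alpha f_1}
  = \frac{e^{i\beta}B(z) + \alpha}{1 + \bar\alpha e^{i\beta}B(z)}
\quad \text{when } f_1 = e^{i\beta}B(z).
```
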
14. Theorem 11 leads immediately to the following result:

THEOREM 12. Let $f(z)$ be a function of class (A) in the circle $|z|<1$. Then
either $f(z)$ is a linear function of a Blaschke product, or to every number $\alpha$
$(|\alpha|<1)$ there exists at least one radius $\theta = \theta_0$ on which $f(z)$ tends to the value $\alpha$:

\[ \lim_{r \to 1} f(re^{i\theta_0}) = \alpha. \]

By Theorem 7 in the second case, the Riemann surface of the inverse function
$z = \phi(w)$ of $w = f(z)$ has non-algebraic singularities over every point of the unit
circle $|w|<1$.

If $f(z)$ is not a linear function of a Blaschke product, then according to
Theorem 11, every value $\alpha$ $(|\alpha|<1)$ which is assumed infinitely often by




$f(z)$ is also a convergence value of $f(z)$. But we know from Theorem 2 that
every value $\alpha$ $(|\alpha|<1)$ which is omitted (or assumed only a finite number of
times) is a convergence value of $f(z)$. Hence every value $\alpha$ $(|\alpha|<1)$ is a
convergence value of $f(z)$. The author has been unable to determine whether or
not functions of the second kind may actually exist.

15. We may now return to the study of the sets $\Sigma$ of values which functions
of class (A) assume infinitely often in the unit circle $|z|<1$. We have
seen in §12 that there are functions of class (A) which omit one value of
modulus less than one. It is an easy matter to obtain functions of class (A) which
omit a non-denumerable infinity of values of modulus less than one.

In order to construct functions of this kind, consider a set $S$ of interior
points of the circle $|w|<1$. Let us assume that the set $S$ is closed relatively
to the circle and that it has the following property: to every $\epsilon > 0$ there
corresponds a sequence of circles $\{C_n\}$ which cover the set $S$ and whose radii
$\delta_n$ satisfy the inequality


A set with this property is said to be of logarithmic measure zero. If the points
of the set $S$ are removed from the interior of the circle $|w|<1$, there remains
an open set of points. If this open set is connected, we denote the resulting
region by $R_S$. If this open set is not connected (as is a priori conceivable), we
denote by $R_S$ an arbitrary one of the regions into which the open set is
decomposed by $S$. It will be shown a little later that $R_S$ is always a dense set in
the circle $|w|<1$, thus proving that the open set in question is always
connected.

By the fundamental theorem of conformal mapping there exists a function
$z = \phi(w)$ which maps the region $R_S$ conformally on the unit circle $|z|<1$
of the $z$-plane. This function $\phi(w)$ is infinitely multiple-valued in the region
$R_S$ if $R_S$ is multiply connected. The Riemann surface $\Phi_S$ of $\phi(w)$ is the
universal covering surface† of the region $R_S$. Such a covering surface is
unramified relatively to $R_S$ and consequently can have non-algebraic singularities
only over the points of the set $S$. Let us now consider the inverse function
$w = f(z)$ of $z = \phi(w)$. This function $f(z)$ is always single-valued, analytic, and
bounded in the circle $|z|<1$: $|f(z)|<1$. The boundary function $f^*(e^{i\theta})$ is
defined by Fatou's theorem in almost all points of the circumference $r = 1$,
$0 < \theta < 2\pi$. Denote by $E$ the set of all those points of the circumference at
which $f^*(e^{i\theta})$ assumes a value belonging to the set $S$. Since $S$ is of logarithmic

† For a detailed account of the subject of conformal mapping of multiply connected regions,
cf. L. Bieberbach, Lehrbuch der Funktionentheorie, vol. 2, 2d edition, 1931, chapters I and IV, as
well as H. Weyl, Die Idee der Riemannschen Fläche, 1913, chapter I, especially §9.




measure zero, it readily follows from a theorem of R. Nevanlinna† that the
set $E$ is linearly measurable and of measure zero. Furthermore, it may be
shown that the equation $f^*(e^{i\theta}) = \alpha$, where $\alpha$ is any complex number of
modulus less than one, has a solution if, and only if, the point $w = \alpha$ is a point of
the set $S$. Hence, $|f^*(e^{i\theta})| < 1$ for a set of values of $\theta$ which is of measure zero.
Thus, for almost all values of $\theta$ we have $|f^*(e^{i\theta})| = 1$. This shows that the
function $f(z)$ is of class (A). On the other hand, it is immediately apparent
from the definition of $f(z)$ that $f(z)$ omits in the circle $|z|<1$ all values
corresponding to points of the set $S$. This proves our initial assertion.

Incidentally, we notice that, according to Theorem 5, $f(z)$, being of class
(A) and possessing singularities on the boundary $|z| = 1$, assumes a set of
values everywhere dense in the circle $|w|<1$. This proves that the region $R_S$
is everywhere dense in the circle $|w|<1$. Hence, if the set $S$ is removed from
the interior of the circle $|w|<1$, there remains a single connected region $R_S$.

With the aid of Theorem 11 we now prove the following theorem:


THEOREM 13. Let $S$ be a set of interior points of the circle $|w|<1$ which is
closed relatively to the circle and of logarithmic measure zero. Denote by $R_S$ the
region which is obtained by removing the points of the set $S$ from the interior of
the circle $|w|<1$. Let $z=\varphi(w)$ be an arbitrary function which maps $R_S$ conformally
on the circle $|z|<1$. Then the inverse function $w=f(z)$ of $z=\varphi(w)$ may be
represented as a linear function of a Blaschke product:

(15.1) $$f(z)=\frac{e^{i\theta}B(z)+\beta}{1+\bar\beta\,e^{i\theta}B(z)},\qquad \theta\ \text{real},\quad |\beta|<1.$$
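
As a hedged numerical illustration of the form (15.1), the following sketch uses a finite Blaschke product as a stand-in for $B(z)$; the theorem itself concerns, in general, an infinite product. The zeros, the angle `theta`, and the constant `beta` are arbitrary illustrative choices and do not come from the paper. The sketch only checks that such an $f$ has modulus one on $|z|=1$ and modulus less than one inside.

```python
import cmath
import math

def blaschke(z, zeros):
    """Finite Blaschke product with the given zeros a_k, |a_k| < 1 (illustrative stand-in)."""
    value = 1.0 + 0.0j
    for a in zeros:
        value *= (z - a) / (1 - a.conjugate() * z)
    return value

def linear_of_blaschke(z, zeros, theta, beta):
    """f(z) = (e^{i theta} B(z) + beta) / (1 + conj(beta) e^{i theta} B(z)), as in (15.1)."""
    w = cmath.exp(1j * theta) * blaschke(z, zeros)
    return (w + beta) / (1 + beta.conjugate() * w)

# Hypothetical parameters chosen only for the demonstration.
zeros = [0.5 + 0.0j, -0.3 + 0.4j, 0.1 - 0.7j]
theta = 0.8
beta = 0.2 - 0.3j

# Largest deviation of |f| from 1 on sample points of the circumference |z| = 1.
boundary_deviation = max(
    abs(abs(linear_of_blaschke(cmath.exp(2j * math.pi * k / 360), zeros, theta, beta)) - 1)
    for k in range(360)
)

# Largest value of |f| on a sample of interior points.
interior_max = max(
    abs(linear_of_blaschke(r * cmath.exp(2j * math.pi * k / 90), zeros, theta, beta))
    for r in (0.0, 0.3, 0.6, 0.9, 0.99)
    for k in range(90)
)

print("max | |f| - 1 | on the circumference:", boundary_deviation)
print("max |f| on interior samples:", interior_max)
```

A finite product is rational and omits no value of the unit circle, so it cannot exhibit the omitted set $S$; the sketch is meant only to make the algebraic shape of (15.1) concrete.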


From the discussion which has preceded the statement of this theorem, it
is evident that $f(z)$ is of class (A) in the circle $|z|<1$ and assumes infinitely
often in this circle every value of modulus less than one which corresponds
to an interior point of the region $R_S$. Furthermore, no such value $a$ can be a
convergence value of $f(z)$, for the Riemann surface $\mathfrak{R}_S$ is unramified relatively
to $R_S$. Consequently, according to Theorem 11, whose hypotheses are satisfied
by the function $f(z)$, the inverse function $f(z)$ of the mapping function $\varphi(w)$
may be represented as a linear function of a Blaschke product $B(z)$ of the form
(15.1). Since $f(z)$ clearly omits all values of $S$, this theorem also shows that to
any set $S$ of logarithmic measure zero and closed relatively to the circle
$|w|<1$ there corresponds a Blaschke product, or a linear function of a Blaschke
product, which assumes all values of modulus less than one infinitely often
in the unit circle $|z|<1$ save those belonging to the set $S$, which it omits.


† R. Nevanlinna, Über die Randwerte von analytischen Funktionen, Commentarii Mathematici
Helvetici, vol. 2 (1930), pp. 237-244.




218 WLADIMIR SEIDEL [January 


16. Theorem 13 is of some interest in the light of a recent remarkable
investigation of Besicovitch.† Among other things Besicovitch generalizes
Weierstrass's theorem on the behavior of single-valued analytic functions in
the neighborhood of an isolated essential singularity to the case of a
non-isolated one. His theorem is as follows:


THEOREM OF BESICOVITCH. If the set $E$ of essential singularities of a single-valued
analytic function $f(z)$ is of linear measure zero, then the set of values of
$f(z)$ in the neighborhood of each of the points of $E$ is everywhere dense on the
complex plane.


It is natural to inquire whether or not it is possible to extend Picard’s 
theorem in an analogous manner. Theorem 13 gives us the means of answer- 
ing this question negatively, thus showing that the hypothesis of an isolated 
essential singularity, while not necessary for Weierstrass’s theorem, cannot be 
dropped in Picard’s theorem. In fact, we shall prove the following assertion: 

There exist single-valued functions $f(z)$ analytic in the whole $z$-plane except
for a set $E$ of essential singularities of linear measure zero and omitting a
non-denumerable set $\Sigma$ of values of logarithmic measure zero.

Let us consider an arbitrary, closed, non-denumerable set of points $S$
of logarithmic measure zero containing the origin $w=0$ and lying in the
interior of some circle $|w|<\rho<1$. On removing the points of the set $S$ from the
circle $|w|<1$, we obtain a region $R_S$. Let $w=f(z)$ be the inverse function of
that function $z=\varphi(w)$ which maps $R_S$ conformally on the circle $|z|<1$,
carrying three preassigned points of the circumference $|w|=1$ into three
preassigned points of the circumference $|z|=1$. According to the result of §15,
the boundary function $f^*(e^{i\theta})$ has the property that

(16.1) $$|f^*(e^{i\theta})|=1$$

for almost all values $\theta$ in the interval $0\le\theta\le2\pi$.

We shall prove now that every point $z=e^{i\theta_0}$ for which $|f^*(e^{i\theta_0})|=1$ is a
point of analyticity of $f(z)$. To this purpose, consider the radius $\theta=\theta_0$ and the
image $L$ of this radius by $w=f(z)$ on the universal covering surface $\mathfrak{R}_S$ of the
region $R_S$. It is clear that the projection of the curve $L$ on the $w$-plane tends
to the point $w=f^*(e^{i\theta_0})$ of modulus one. We shall now show that the curve $L$ from a
certain point on lies wholly on one sheet of the Riemann surface $\mathfrak{R}_S$. If this
were not the case, then $L$ would have to wind infinitely many times around
branch points. Since, however, $\mathfrak{R}_S$ has no branch points in the ring $\rho<|w|<1$,
the curve $L$ either would have to wind infinitely often in the ring $\rho<|w|<1$
or would have to penetrate the circle $|w|<\rho$ infinitely many times. In
either case, its projection would not tend to a single limit point. Hence, the
curve $L$ from a certain point on lies on a single sheet of $\mathfrak{R}_S$ and terminates in
a definite boundary point $P$ of $\mathfrak{R}_S$ which lies over the point $w=f^*(e^{i\theta_0})$. Now, the
function $z=\varphi(w)$ is clearly analytic in the point $P$ and $\varphi'(P)\neq0$, since $P$ is
an interior point of the free analytic curve $|w|=1$, that is, of a curve whose
points are not limit points of boundary points not belonging to the curve.
Hence, $f(z)$ is analytic in the point $z=e^{i\theta_0}$. Consequently, by virtue of (16.1),
$f(z)$ is analytic and of modulus one in almost all points of the circumference
$|z|=1$, which form an open, everywhere dense set.

† A. S. Besicovitch, On sufficient conditions for a function to be analytic, Proceedings of the London
Mathematical Society, (2), vol. 32 (1931), pp. 1-9.

By Schwarz's reflection principle $f(z)$ may be continued analytically
across points of this open set on the circumference $|z|=1$ in accordance with
the functional relation

$$f\!\left(\frac{1}{\bar z}\right)=\frac{1}{\overline{f(z)}}.$$

Since the set $S$ was so chosen as to contain the origin $w=0$, this defines a
function $f(z)$ analytic in the whole plane except for a perfect nowhere dense
set of linear measure zero of essential singularities on the circumference
$|z|=1$. Furthermore, if we denote by $S'$ the image set of $S$ by an inversion
in the unit circle, it is evident that the function $f(z)$ omits all values belonging
to the set $\Sigma=S+S'$.
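
The functional relation used in the reflection can be checked numerically on a simple function of modulus one on $|z|=1$. The sketch below is only an illustration under stated assumptions: it uses a single Blaschke factor with a hypothetical zero `a`, which is rational and therefore has none of the essential singularities of the function constructed in the text.

```python
def blaschke_factor(z, a):
    """Single Blaschke factor (z - a) / (1 - conj(a) z); has modulus one on |z| = 1."""
    return (z - a) / (1 - a.conjugate() * z)

def reflect(z):
    """Inversion in the unit circle: z -> 1 / conj(z)."""
    return 1 / z.conjugate()

def reflection_gap(z, a):
    """Relative difference between f(1/conj(z)) and 1/conj(f(z)) for the Blaschke factor."""
    left = blaschke_factor(reflect(z), a)
    right = 1 / blaschke_factor(z, a).conjugate()
    return abs(left - right) / abs(right)

# Hypothetical zero and sample points; the points avoid z = 0 and z = a,
# where one side of the relation is infinite.
a = 0.4 - 0.25j
sample_points = [0.3 + 0.2j, -0.6 + 0.1j, 0.05 - 0.8j]
gaps = [reflection_gap(z, a) for z in sample_points]

for z, gap in zip(sample_points, gaps):
    print(z, gap)
```

In the text the relation is used in the opposite direction, as a definition of $f$ outside the unit circle; it is well defined there because $f$ omits the value $0$, which belongs to $S$.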

17. It is natural to inquire whether Theorem 13 still holds when the set S 
of logarithmic measure zero is replaced by a set of linear measure zero. The 
answer hinges on whether or not the inverse of the mapping function is of 
class (A). In the sequel we shall prove that in general the question is to be an- 
swered in the negative. The sets S with which we shall deal can be charac- 
terized as follows: 

(1) S is closed. 

(2) If the points of S are removed from the plane, the remaining set of 
points is connected. 

(3) There exists a function $s(\rho)$ which is positive, continuous, and
monotonically increasing for $\rho>0$ such that the integral

$$\int_0^k\frac{s(\rho)}{\rho}\,d\rho$$

is finite for some positive value of $k$, and there exists a positive number $\epsilon$
such that for every sequence of circles $\{C_n\}$ with the radii $\rho_n$ which cover $S$
the inequality

$$\sum_n s(\rho_n)>\epsilon$$

is satisfied. All sets $S$ satisfying conditions 1, 2, and 3 will be said to possess
the property (K). Given a set $S$ and a function satisfying condition 3, denote
by $C(\lambda)$ the greatest lower bound of sums $\sum_{n=1}^{\infty}s(\rho_n)$ corresponding to a
sequence of circles $\{C_n\}$ of radii $\rho_n$ which cover $S$ and such that $\rho_n<\lambda$. The
function $C(\lambda)$ is non-increasing. Hence, $\lim_{\lambda\to0}C(\lambda)=m_sS$ exists and will be called
the $s$-measure of the set $S$. Thus, if $s(\rho)=\rho^{\alpha}$ ($\alpha>0$), condition 3 states that $S$
is of positive $\alpha$-dimensional measure (in particular, $\alpha=1$ gives linear measure
and $\alpha=2$ superficial measure). The $s$-measure of a set, however, may also be
defined by means of the function


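$$s(\rho)=\frac{1}{\left(\log^+\dfrac{1}{\rho}\right)^{1+\eta}}\qquad(\eta>0),$$

or more generally by means of the function

$$s(\rho)=\frac{1}{\log^+\dfrac{1}{\rho}\;\log_2^+\dfrac{1}{\rho}\cdots\left(\log_n^+\dfrac{1}{\rho}\right)^{1+\eta}}\qquad(\eta>0).$$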

It is evident that a set S with the property (K) may be of linear measure zero. 
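
A classical instance of this remark is the middle-third Cantor set placed on a line segment in the plane: it is closed, its complement in the plane is connected, and it has linear measure zero, yet it has positive $\alpha$-dimensional measure for $\alpha\le\log 2/\log 3$. The sketch below, which is an illustration and not part of the paper, evaluates the covering sums $\sum s(\rho_n)$ with $s(\rho)=\rho^{\alpha}$ for the natural covers by $2^n$ circles of radius $3^{-n}/2$. These particular covers only bound $C(\lambda)$ from above; the positivity for small $\alpha$ is a known fact that the code does not prove.

```python
import math

def cover_sum(level, alpha):
    """Sum of s(rho) = rho**alpha over the natural level-n cover of the Cantor set.

    At level n the set is covered by 2**n intervals of length 3**(-n); each
    interval lies in a circle of radius 3**(-n) / 2 about its midpoint.
    """
    count = 2 ** level
    radius = 0.5 * 3.0 ** (-level)
    return count * radius ** alpha

# Critical exponent log 2 / log 3 of the Cantor set.
critical_alpha = math.log(2) / math.log(3)

for alpha in (0.5, critical_alpha, 1.0):
    sums = [cover_sum(level, alpha) for level in (5, 10, 20)]
    print(f"alpha = {alpha:.4f}: covering sums at levels 5, 10, 20 -> {sums}")
```

For $\alpha=1$ the sums tend to zero, which reflects linear measure zero; at the critical exponent they stay constant, and below it they grow.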

In view of a later application which is to be made of the following result, 
we state it in the form of a lemma: 

Lemma. Let $S$ be a set of points possessing the property (K) and lying in the
interior of a circle $|w|<\lambda<1$. If the set $S$ is removed from the interior of the
circle $|w|<1$, there remains an open connected region $R_S$. Denote by $z=\varphi(w)$ a
function which maps $R_S$ conformally on the unit circle $|z|<1$. Then the inverse
function $w=f(z)$ of $z=\varphi(w)$ is not of class (A).

The function $w=f(z)$ is bounded in the unit circle $|z|<1$: $|f(z)|<1$.
Consequently, by Fatou's theorem,

$$\lim_{r\to1}f(re^{i\theta})=f^*(e^{i\theta})\qquad(z=re^{i\theta})$$

exists for almost all values of $\theta$ in the interval $0\le\theta\le2\pi$. It is to be shown
that†

$$mE(|f^*(e^{i\theta})|<1)>0.$$

We shall assume that

(17.1) $$mE(|f^*(e^{i\theta})|<1)=0,$$

and derive a contradiction.


† The notation $mE(|f^*(e^{i\theta})|<1)$ is used here to denote the linear measure of that set of points
of the circumference $|z|=1$ for which $|f^*(e^{i\theta})|<1$.




We may remove the set $S$ and the point at infinity, $w=\infty$, from the whole
$w$-plane, and thus obtain a new region $A$ which we shall map conformally on
the unit circle $|t|<1$ by means of a function $t=\Phi(w)$. Consider the inverse
function $w=F(t)$ which is single-valued and analytic in the circle $|t|<1$.
R. Nevanlinna proved that $F(t)$ may be represented as the quotient of two
functions each of which is analytic and bounded in the circle $|t|<1$.† Hence,
as may be easily seen from a theorem of F. and M. Riesz,‡

(17.2) $$\lim_{\rho\to1}F(\rho e^{i\tau})=F^*(e^{i\tau})\qquad(t=\rho e^{i\tau})$$

exists for almost all values of $\tau$ in the interval $0\le\tau\le2\pi$. Nevanlinna further
shows that each finite convergence value (17.2), when represented as a point
in the $w$-plane, is a point of the set $S$. Hence

(17.3) $$mE(|F^*(e^{i\tau})|<\lambda)=2\pi,$$

where by hypothesis $\lambda<1$.

Consider now the set $G$ of all those points of the circle $|t|<1$ in which
$|F(t)|<1$. This set of points is evidently open and, as we shall show now,
connected. If the set $G$ is not connected, it may be decomposed into two or
more connected open subsets $G_1, G_2, \ldots$. Let $G_1$ and $G_2$ be any two of these
subsets and let $t_1$ be a point of $G_1$ and $t_2$ a point of $G_2$. Let $l$ be any continuous
path in the circle $|t|<1$ connecting $t_1$ and $t_2$. Then, there exists at least one
point $t_0$ on $l$ for which $|F(t_0)|=1$. But now $F(t)$ maps the unit circle $|t|<1$
in a one-to-one manner and conformally on the universal covering surface
$\mathfrak{R}_A$ of the region $A$. Consequently, the curve $l$ is mapped on a curve $L$ lying
on the surface $\mathfrak{R}_A$, joining two points $w_1$ and $w_2$ which lie over two interior
points of the circle $|w|<1$ and passing through at least one point $w_0$ which
lies over some point of the closed region $|w|\ge1$. We know, however, from
the definition of the region $A$ that the surface $\mathfrak{R}_A$ is unramified over the region
$\lambda<|w|<\infty$. Hence, we may deform the curve $L$ continuously into a curve
$L'$ which joins the points $w_1$ and $w_2$ and lies wholly over the interior of the
circle $|w|<1$. Hence, the curve $l$ may be deformed continuously into a
curve $l'$ which joins the two points $t_1$ and $t_2$ and on which $|F(t)|<1$. Hence,
the two sets $G_1$ and $G_2$ could not have been distinct. This proves that $G$ is a
connected open set.

The region $G$ cannot be wholly contained in any circle $|t|<r<1$.
Furthermore, as follows easily from the maximum principle of analytic functions,
the region $G$ is simply connected. Consider now that connected part
$\mathfrak{R}_A'$ of the surface $\mathfrak{R}_A$ into which the region $G$ is mapped by the function
$w=F(t)$. From the definition of $G$ it follows that $\mathfrak{R}_A'$ is that part of the surface
$\mathfrak{R}_A$ which lies over the region $R_S$. Furthermore, from the simple connectivity
of $G$ follows the simple connectivity of $\mathfrak{R}_A'$. Hence, $\mathfrak{R}_A'$ is a simply connected,
unramified, and unbounded covering surface of the region $R_S$. It follows from
this at once that $\mathfrak{R}_A'$ is the universal covering surface $\mathfrak{R}_S$ of $R_S$. To sum up
the preceding, we have obtained a simply connected subregion $G$ of the circle
$|t|<1$ which is mapped in a one-to-one manner and conformally by $w=F(t)$ on
the universal covering surface $\mathfrak{R}_S$ of the region $R_S$.

† R. Nevanlinna, Commentarii Mathematici Helvetici, loc. cit., pp. 250-252.
‡ F. and M. Riesz, Über die Randwerte einer analytischen Funktion, Compte Rendu du Quatrième
Congrès des Mathématiciens Scandinaves (1920), pp. 28-30.

18. Denote by $w=f(z)$ the inverse function of some function $z=\varphi(w)$
which maps the region $R_S$ conformally on the circle $|z|<1$. The function
$w=f(z)$ maps, therefore, the circle $|z|<1$ in a one-to-one manner and
conformally on the universal covering surface $\mathfrak{R}_S$ of the region $R_S$.
Consequently, by virtue of the result of §17 the function $t=\Phi(f(z))$, where $t=\Phi(w)$
is the function inverse to $w=F(t)$, establishes a
one-to-one conformal map between the circle $|z|<1$ and the subregion $G$ of
$|t|<1$.

Before completing the proof of the lemma, it is necessary to make some
additional remarks about the region $G$. If the point $t=e^{i\tau}$ is such that

(18.1) $$\lim_{\rho\to1}F(\rho e^{i\tau})$$

exists, then by virtue of the fact that $F(t)$ can be represented as the quotient
of two bounded functions and by Lindelöf's theorem it follows that the limit
(18.1) exists uniformly in every angle smaller than 180° whose vertex lies
in the point $e^{i\tau}$ and whose bisector falls along the radius joining the origin
$t=0$ with the point $e^{i\tau}$. From (17.3) it follows by applying Egoroff's well
known theorem that there exists a perfect set $\Omega$ of positive measure on the
circumference $|t|=1$ and a positive number $r_0<1$ such that if $e^{i\tau}$ is an
arbitrary point of $\Omega$ and $A(e^{i\tau})$ an angle of 60° whose vertex lies in the point
$e^{i\tau}$ and whose bisector falls on the radius joining the origin $t=0$ with the point
$t=e^{i\tau}$, then $|F(t)|<1$ for every point $t$ of the angle $A(e^{i\tau})$ whose distance
$|t|$ from the origin is greater than $r_0$. That part of the angle $A(e^{i\tau})$ whose
points lie at a distance not less than $r_0$ from the origin will be denoted by
$B(e^{i\tau})$. Since $G$ was defined as the totality of those points of the circle $|t|<1$
for which $|F(t)|<1$, it follows that the region $B(e^{i\tau})$ lies wholly within $G$ for
every point $e^{i\tau}$ belonging to the set $\Omega$.

Now, the set $C\Omega$ complementary to $\Omega$ is open and consists of denumerably
many open arcs $a_nb_n$ of the periphery $|t|=1$, no two arcs having any
points in common. Through each end point $a_n$ and $b_n$ draw a line forming an
angle of 30° with the corresponding radius. These two lines intersect in a
point $c_n$. We thus obtain denumerably many circular sectors $a_nb_nc_n$.
Consider now a point $t$ ($|t|<1$), whose distance $|t|$ from the origin $t=0$ is greater
than $r_0$ and which lies outside of all the sectors $a_nb_nc_n$. Corresponding to this
point there exists a point $e^{i\tau}$ of the set $\Omega$ such that the region $B(e^{i\tau})$
contains $t$ in its interior or on its boundary. In either case $t$ is an interior point
of $G$. There exist at most a finite number of sectors, which we number
$a_1b_1c_1, \ldots, a_Mb_Mc_M$, falling in part within the circle $|t|<r_0$. If $n\le M$, denote by
$a_n'$ the intersection of $a_nc_n$ with $|t|=r_0$ and by $b_n'$ the intersection of $b_nc_n$
with $|t|=r_0$. Let us assume that the sectors $a_1b_1c_1, \ldots, a_Mb_Mc_M$ have been
so numbered that the points $a_1, b_1, a_2, b_2, \ldots, a_M, b_M$ describe the
circumference $|t|=1$ in counter-clockwise sense. Then, at least one of the arcs
$b_1a_2, b_2a_3, \ldots, b_{M-1}a_M, b_Ma_1$ contains a subset of positive measure of the set $\Omega$.
Denote any such arc by $b_na_{n+1}$. Consider now the contour $b_nb_n'a_{n+1}'a_{n+1}$ as
shown in the figure. If we remove all the circular sectors which fall within the
region bounded by the contour $b_nb_n'a_{n+1}'a_{n+1}$, we obtain a new region $G'$ which
is a subregion of $G$. The boundary of the region $G'$ is by construction a closed
rectifiable Jordan curve and it has a set of points of positive measure in
common with the circumference $|t|=1$, namely that subset of $\Omega$ which
lies on the arc $b_na_{n+1}$.


[Figure: the contour $b_nb_n'a_{n+1}'a_{n+1}$ described in the text.]


19. We have assumed that (17.1) holds; that is, that the function $f(z)$
is of class (A). Consider some point $P$ of the subset of $\Omega$ which lies on the arc
$b_na_{n+1}$. Let $\gamma$ be an arc which lies in the region $G'$ except for its end point $P$.
The function $w=F(t)$ maps it on an arc $\gamma'$ lying on the Riemann surface
$\mathfrak{R}_S$ and terminating in some point which lies over a point of the set $S$. The
function $z=\varphi(w)$ maps $\gamma'$ on an arc $\gamma''$ which terminates in some point $P''$
of $|z|=1$ belonging to the set $E(|f^*(e^{i\theta})|<1)$ which by (17.1) is of measure
zero.

Consider now a function $\nu(z)$ defined and bounded in the unit circle
$|z|<1$: $|\nu(z)|<1$, and such that it does not tend to any limit along all
paths terminating in points of the set $E(|f^*(e^{i\theta})|<1)$. That such functions
exist is known from a result of Lusin and Priwaloff.† We have seen in the
beginning of §18 that the function $t=\Phi(f(z))$ maps the circle $|z|<1$ in a
one-to-one manner on the region $G$. Denote by $z=g(t)$ the inverse function
and consider the new function $\mu(t)=\nu(g(t))$. It follows from the last paragraph
that $\mu(t)$ approaches no limit along any path of the region $G$ (and in particular
of the region $G'$) terminating in points of that subset of $\Omega$ which lies on the
arc $b_na_{n+1}$. We thus obtain a bounded analytic function $\mu(t)$ in a region $G'$
bounded by a single rectifiable Jordan curve and a set of boundary points of
$G'$ of positive linear measure such that $\mu(t)$ tends to no limit along any path
in $G'$ terminating in a point of the set. This, however, contradicts a known
theorem.‡ Thus the assumption (17.1) leads to a contradiction. Consequently,
$mE(|f^*(e^{i\theta})|<1)>0$, and $f(z)$ is not of class (A).

20. By means of the lemma just established we prove 

THEOREM 14. Let $f(z)$ be a function which is analytic and bounded in the unit
circle $|z|<1$: $|f(z)|<1$. Let $A$: $\alpha_1<\theta<\alpha_2$, $r=1$, be an arc of the circumference
$|z|=1$ such that

(20.1) $$\lim_{r\to1}f(re^{i\theta})=f^*(e^{i\theta}),\qquad |f^*(e^{i\theta})|=1,$$

for almost all values of $\theta$ in the interval $\alpha_1<\theta<\alpha_2$. Then either $f(z)$ is analytic
on the arc $A$ or $f(z)$ assumes in the circle $|z|<1$ every value $a$ ($|a|<1$) save
perhaps for a set $S$ of such values possessing the following property: For every
function $s(\rho)$ positive, continuous, and monotonically increasing in $\rho>0$ such
that the integral

$$\int_0^k\frac{s(\rho)}{\rho}\,d\rho\qquad(k>0)$$

is finite for some $k$, and for any positive $\epsilon$ there exists a sequence of circles $\{C_n\}$
with radii $\rho_n$ covering the set and such that $\sum_{n=1}^{\infty}s(\rho_n)<\epsilon$.


† N. Lusin and J. Priwaloff, Sur l'unicité et la multiplicité des fonctions analytiques, Annales
Scientifiques de l'École Normale Supérieure, (3), vol. 42 (1925), pp. 157-159.

‡ V. V. Golubev, Single-valued analytic functions with perfect singular sets (in Russian), Moscow,
1916, p. 44. Cf. also F. and M. Riesz, loc. cit., p. 40.




Let the function $f(z)$ be not analytic on the arc $A$. If the set $S$ of values in
the $w$-plane which $w=f(z)$ omits does not satisfy the condition of Theorem
14, then the set possesses the property (K) defined in §17. Thus, for some
function $s(\rho)$ the $s$-measure of the set $S$ is positive. Consider a monotonically
increasing sequence of positive numbers $\{r_n\}$ which tend to 1 in the limit.
Denote by $S_1$ that part of $S$ which lies in the circle $|w|\le r_1$, and in general
by $S_n$ that part of $S$ which lies in the ring $r_{n-1}\le|w|\le r_n$. Then $S=\sum_n S_n$.
Furthermore, it follows immediately from the definition of $s$-measure that

$$m_sS\le\sum_n m_sS_n.$$

Hence, if the $s$-measure of $S$ is positive, then the $s$-measure of at least one
set $S_n$ is likewise positive. Consider any such set $S_n$. It possesses the property
(K) and lies in the interior of some circle $|w|<\lambda<1$. If the set $S_n$ is removed
from the interior of the circle $|w|<1$, there remains an open connected region
$R_{S_n}$. Denote by $t=\varphi(w)$ some function which maps the region $R_{S_n}$ on the
unit circle $|t|<1$. By the lemma stated in §17 the inverse function $w=\psi(t)$
of $t=\varphi(w)$ is not of class (A). That is to say, the relations

(20.2) $$\lim_{\sigma\to1}\psi(\sigma e^{i\tau})=\psi^*(e^{i\tau}),\qquad |\psi^*(e^{i\tau})|<\lambda<1\qquad(t=\sigma e^{i\tau})$$

hold for a set of positive measure of values of $\tau$ in the interval $0\le\tau\le2\pi$.

Consider now the function $t=\varphi[f(z)]$. Since $f(z)$ does not assume any
value of the set $S_n$, and since $\varphi(w)$ has the points of $S_n$ as its only singularities,
it follows that the function $\varphi[f(z)]$ may be continued analytically along every
path lying in the interior of the unit circle $|z|<1$. By the monodromy
theorem, therefore, the function $\varphi[f(z)]$ is analytic and single-valued in the
circle $|z|<1$ as soon as some definite branch of $\varphi(w)$ has been selected. It is
furthermore immediately evident that

$$|\varphi[f(z)]|<1$$

in the circle $|z|<1$. Next, consider any radius $\theta=\theta_0$ terminating in a point
of the arc $A$ on which $f(z)$ tends to a limit and $|f(z)|$ tends to the value 1.
Such a radius $\theta=\theta_0$ is mapped by $w=f(z)$ on a continuous curve $\Gamma_0$ in the
circle $|w|<1$ terminating in a point $P_0$ of the periphery $|w|=1$. But now the
function $t=\varphi(w)$ maps the curve $\Gamma_0$ on a curve $\gamma_0$ in the circle $|t|<1$
terminating in a point $\Pi_0$ of the periphery $|t|=1$. Since by hypothesis the
relations (20.1) hold for almost all points of the arc $A$, this proves that the
function $\varphi[f(z)]$ likewise satisfies the conditions


(20.3) $$\lim_{r\to1}\varphi[f(re^{i\theta})]=\varphi^*[f^*(e^{i\theta})],\qquad |\varphi^*[f^*(e^{i\theta})]|=1,$$

for almost all values of $\theta$ in the interval $\alpha_1<\theta<\alpha_2$; that is, for almost all
points of the arc $A$. Furthermore, by Lindelöf's theorem, if $\Pi_0$ is the point
$e^{i\tau_0}$, then

$$\lim_{\sigma\to1}|\psi(\sigma e^{i\tau_0})|=1.$$

Thus we see that if $e^{i\theta_0}$ is a point for which $|f^*(e^{i\theta_0})|=1$, there corresponds to
it a point $e^{i\tau_0}=\varphi^*[f^*(e^{i\theta_0})]$ such that $|\psi^*(e^{i\tau_0})|=1$. Consider now the set $E$
of all those points $e^{i\theta}$ for which (20.3) holds. Then, by Nevanlinna's theorem,
stated in §8, the set $E'$ of points

$$e^{i\tau}=\varphi^*[f^*(e^{i\theta})],$$

where $e^{i\theta}$ describes the set $E$, is of measure $2\pi$. Hence, the relation

$$|\psi^*(e^{i\tau})|=1$$

is satisfied in almost all points of the circumference $|t|=1$. This, however,
contradicts the fact that the relations (20.2) hold for a set of positive measure
on the circumference $|t|=1$. Thus, our assumption that for some function
$s(\rho)$ the $s$-measure of the set $S$ is positive has led to a contradiction. This
proves the theorem.

As an immediate corollary of Theorem 14, we state the following exten- 
sion of Schwarz’s reflection principle: 

Let $w=f(z)$ be a bounded analytic function in the unit circle $|z|<1$:

$$|f(z)|<1.$$

Let $A$: $\alpha_1<\theta<\alpha_2$, $r=1$, be an arc of the circumference $|z|=1$ such that

$$\lim_{r\to1}f(re^{i\theta})=f^*(e^{i\theta}),\qquad |f^*(e^{i\theta})|=1,$$

for almost all values of $\theta$ in the interval $\alpha_1<\theta<\alpha_2$. Let $s(\rho)$ be a positive,
continuous, monotonically increasing function for $\rho>0$ such that the integral

$$\int_0^k\frac{s(\rho)}{\rho}\,d\rho\qquad(k>0)$$

is finite for some $k$. If $f(z)$ does not assume in the unit circle $|z|<1$ values which
form a set of positive $s$-measure in the circle $|w|<1$, then $f(z)$ may be continued
analytically beyond the arc $A$ in accordance with the functional relation

$$f\!\left(\frac{1}{\bar z}\right)=\frac{1}{\overline{f(z)}}.$$

HARVARD UNIVERSITY, 
CAMBRIDGE, Mass. 


CONVERSES OF GAUSS’ THEOREM 
ON THE ARITHMETIC MEAN* 


BY 
OLIVER D. KELLOGG 


1.1. Introduction. Let T denote a domain (open continuum) in the plane.
Let u be a function, continuous in T, and equal, at each point P of T, to the
arithmetic mean of its values on each circle with center at P and lying, with
its interior, in T. According to Gauss' theorem, this is the property of any
function u harmonic in T. On the other hand, Koebe† showed that, conversely,
any function with the stated properties is harmonic in T. It is with
some extensions of this theorem of Koebe that we are concerned.
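
The mean-value property in question is easy to observe numerically. The sketch below is an illustration only, with functions, center, and radius chosen arbitrarily: it averages a harmonic polynomial and a non-harmonic one over a circle and compares each average with the value at the center.

```python
import math

def circle_mean(u, cx, cy, radius, samples=2000):
    """Arithmetic mean of u over equally spaced points of the circle about (cx, cy)."""
    total = 0.0
    for k in range(samples):
        angle = 2 * math.pi * k / samples
        total += u(cx + radius * math.cos(angle), cy + radius * math.sin(angle))
    return total / samples

def harmonic(x, y):
    """x^2 - y^2 satisfies Laplace's equation."""
    return x * x - y * y

def non_harmonic(x, y):
    """x^2 + y^2 has Laplacian 4, so it is not harmonic."""
    return x * x + y * y

# Hypothetical center and radius for the demonstration.
cx, cy, radius = 0.3, -0.2, 0.5

harmonic_gap = circle_mean(harmonic, cx, cy, radius) - harmonic(cx, cy)
non_harmonic_gap = circle_mean(non_harmonic, cx, cy, radius) - non_harmonic(cx, cy)

print("harmonic: mean minus center value =", harmonic_gap)
print("non-harmonic: mean minus center value =", non_harmonic_gap)
```

For the second function the mean exceeds the center value by the square of the radius, which shows how the property fails for a non-harmonic function.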

Undoubtedly, one of the simplest proofs of Koebe's theorem is as follows.
Let c be any circle lying with its interior C in T. Let m be the minimum of u
in C+c. Then, since u is continuous in C+c, either u assumes the value m
on c, or else there is a point P of C such that u=m at P, and u>m at all points
of C+c nearer to c than is P. But, by the mean-value property, the second of
these two alternatives is impossible. Thus, u assumes its lower bound in C+c
on c. Applying this result to u−v and to v−u, where v is the function harmonic
in C, continuous on C+c, and equal to u on c, we conclude that u
is harmonic in C, and therefore in T.

This reasoning can evidently be applied in case the circles attached to 
each point P, on which the arithmetic mean of u is equal to the value of u at 
their center P, constitute any infinite set whose radii have 0 as lower limit. 
It fails, of course, if these radii are bounded away from 0. The question then 
arises as to what can be said about a function which coincides at each point 
P with its arithmetic mean on some single circle about P as center. In what 
follows we shall be concerned with functions satisfying a condition of this 
type. 
1.2. All our theorems are valid both in the plane and in space. The proofs 
themselves are independent of the number of dimensions if “circle” is inter- 
preted as “sphere” and “plane” as “space” in the case of three dimensions. 
We shall always suppose that T is a bounded domain. Results of a similar 
character are readily obtained by inversions for unbounded domains, con- 
taining or not the point at infinity, provided these domains do not, with their 

* Presented to the Society, June 23, 1933; received by the editors May 11, 1933. Prepared for
publication by J. J. Gergen.
† Koebe, 6. Numbers in heavy type refer to the bibliography at the end of this paper.




228 O. D. KELLOGG [April 


boundaries, fill out the extended plane. The results of some of the theorems 
can easily be extended to domains not having this latter property. This does 
not, however, seem to be the case for those of others. In Theorem V, for ex- 
ample, a condition equivalent to boundedness seems to be essential. 

We denote by t the boundary of T. We associate with each point P of T
a normal* domain C(P), containing P, and consisting wholly of points of T.
We denote by c(P) the boundary of C(P). This boundary may coincide
wholly or in part with t. We denote by τ the set of those points common to t
and some c(P).

We denote, for a function u continuous in T+τ, by A{u(P)}, or by
A(u), the value at P of the function harmonic in C(P), continuous in
C(P)+c(P), and equal to u on c(P). Our theorems are concerned chiefly with
a continuous function u satisfying u(P)=A(u) in T. This condition is, of
course, a generalization of the condition that u be equal at each point P to
the arithmetic mean of its values on some circle about P as center. In fact,
when c(P) is a circle about P as center, then A{u(P)} is the arithmetic mean
of u on c(P).

2.1. On the bounds of a function satisfying u=A(u). We first apply the
reasoning instrumental in the preceding proof of Koebe's theorem. We obtain


THEOREM I. Let u be continuous in T+τ. Then, if u(P) ≥ A(u) in T, it
follows that u tends to its lower bound in T on a sequence of points in T tending
to a point of t. Similarly, if u(P) ≤ A(u), then u tends to its upper bound in T on
a sequence of points in T tending to a point of t.


We need consider only the first part of this theorem. Let m denote the
lower bound of u in T; and let λ be the set of points in T+τ on which u=m.
If λ is void, or if λ contains any point of t, the conclusion follows from the
continuity of u in T+τ. Let us suppose then that λ is not void and that it lies
wholly in T. Let P be a point of λ. Then, by a familiar property of harmonic
functions, and our hypothesis on u, it follows that all the points of c(P) are
points of λ. Thus, u=m on a point nearer to t than is P. We conclude that the
distance from λ to t is 0; for if it were positive the set λ would be closed and
we could select, contrary to fact, a point P of λ such that no other point of
λ lies nearer to t than does P. This proves the theorem.

2.2. A different form of reasoning gives a somewhat stronger result than 
that embodied in Theorem I. It can be shown in fact that, under the first 
hypotheses of that theorem, if u attains in T its lower bound, u attains that 

* A domain C is normal if the Dirichlet problem for C admits a solution for every assigned function
continuous on the boundary of C. For a set of references on the Dirichlet problem and on
harmonic functions in general, the reader is referred to Kellogg, 2 and 4.


1934] CONVERSES OF GAUSS’ THEOREM 229 


bound in T+τ in every neighborhood of every point of t. This is a consequence
of the following theorem. We note also, as another corollary of this theorem,
that if u is continuous in T+τ and satisfies u(P)=A(u) in T, then u cannot
assume in T both its bounds without reducing to a constant.


THEOREM II. Let u be continuous in T+τ. Then, if u attains in T its lower
bound m in T, and if u(P)=A(u) in T, it follows that every point of T is a
point of some C(P) on the boundary of which u=m identically.


Let $P_1$ be a point of T at which u=m. Then plainly u=m on $c(P_1)$. Now
let $P_2$ be any second point of T. If $u(P_2)=m$, then u=m on $c(P_2)$ and there
is nothing to prove. In the contrary case, let a be a polygonal line, lying in
T, and having its end points at $P_1$ and $P_2$. Consider the points of a at which
u=m. This set is a non-null closed set. We therefore can select the last point
$P^*$, in the sense $P_1$ to $P_2$, at which u=m. On $c(P^*)$ we have u=m.
Accordingly, $c(P^*)$ cannot cut a between $P^*$ and $P_2$. Thus, $P_2$ is a point of $C(P^*)$.
We conclude the truth of the theorem.

2.3. The question now arises whether, in contradistinction to non- 
constant harmonic functions, a non-constant function having the generalized 
mean-value property can attain one or both of its bounds without reducing 
to a constant. The answer is that in general such a function can assume both 
its bounds. Consider in fact the following example. 

Let O be any fixed point. Let T be the interior of the unit circle about O
as center. Let $b_n$, $n=2, 3, \ldots$, be the circle of radius $1-1/n$ about O. Let
u(P)=0 for OP ≤ 1/2 and for P on $b_2, b_4, \ldots$; and let u(P)=1 for P on
$b_3, b_5, \ldots$. Finally, let u be harmonic in the region bounded by $b_n$, $b_{n+1}$ and
assume continuously the values defined for u on $b_n$ and $b_{n+1}$. Then, noting
that, if P is on $b_n$, $n=2, 3, \ldots$, we can take for C(P) the interior of $b_{n+2}$,
and that if P is not on one of the circles $b_n$ we can take for C(P) the interior
of any sufficiently small circle about P, we see that u is continuous in T and
has the generalized mean-value property. In addition, we see that u assumes
in T its lower bound 0 and its upper bound 1.
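
The example can be made concrete, since a harmonic function between two concentric circles with constant boundary values has the form $A+B\log r$. The sketch below is a hedged illustration whose function names are ours, not the paper's. It builds this radial u and checks three things: the value of u on $b_5$ equals its value on $b_3$, so that the harmonic function in the interior of $b_5$ with those boundary values is the constant 1; u has the ordinary mean-value property on a small circle about a point between $b_3$ and $b_4$; and the mean over a small circle about a point of $b_3$ falls short of 1.

```python
import math

def radius(n):
    """Radius 1 - 1/n of the circle b_n."""
    return 1 - 1 / n

def ring_value(n):
    """Prescribed value of u on b_n: 0 for even n, 1 for odd n."""
    return 0.0 if n % 2 == 0 else 1.0

def u_radial(r):
    """The example function as a function of the distance r = OP, 0 <= r < 1."""
    if not 0 <= r < 1:
        raise ValueError("u is defined only inside the unit circle")
    if r <= 0.5:
        return 0.0
    n = 2
    while radius(n + 1) < r:  # find n with radius(n) < r <= radius(n + 1)
        n += 1
    r0, r1 = radius(n), radius(n + 1)
    v0, v1 = ring_value(n), ring_value(n + 1)
    # Radial harmonic interpolation A + B log r between the two circles.
    return v0 + (v1 - v0) * math.log(r / r0) / math.log(r1 / r0)

def u(x, y):
    return u_radial(math.hypot(x, y))

def circle_mean(func, cx, cy, rho, samples=2000):
    """Arithmetic mean of func over equally spaced points of a circle."""
    total = 0.0
    for k in range(samples):
        angle = 2 * math.pi * k / samples
        total += func(cx + rho * math.cos(angle), cy + rho * math.sin(angle))
    return total / samples

# u on b_5 equals u on b_3, so with C(P) the interior of b_5 one has A(u) = 1 on b_3.
value_on_b3 = u_radial(radius(3))
value_on_b5 = u_radial(radius(5))

# Ordinary mean-value property at a point strictly between b_3 and b_4.
gap_between_circles = circle_mean(u, 0.7, 0.1, 0.03) - u(0.7, 0.1)

# A small circle about a point of b_3 gives a mean below u = 1 at the center.
small_mean_on_b3 = circle_mean(u, radius(3), 0.0, 0.02)

print(value_on_b3, value_on_b5, gap_between_circles, small_mean_on_b3)
```

The last check shows why the choice of C(P) matters: on the circles $b_n$ the function is not harmonic, and only a suitably chosen domain such as the interior of $b_{n+2}$ restores the mean-value relation.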

2.4. Considerably more information can be obtained in regard to the 
question raised in the preceding paragraph. It can be shown that a function 
u of the prescribed type cannot attain both its bounds if the domains C(P) 
are of a sufficiently restricted character. We consider in the next theorem a 
set of domains satisfying the following conditions: 
(a) the diameter 5(P) of C(P) tends to 0 as P tends to any point of t; 
(b) the boundary c(P) has at least one point in common with c(Q) if there is a 
point of c(P) exterior and a point of c(P) interior to C(Q). 

The second of these two conditions is satisfied, of course, if each c(P) is 




a connected set. Both conditions are satisfied if each c(P) is a circle about P 
as center. 


THEOREM III. Let the domains C(P) satisfy the conditions (a) and (b). Let
u be continuous in $T + \tau$ and let u(P) = A(u) in T. Then, if the bounds of u in
T are distinct, both these bounds cannot be attained by u in T.


To prove this we show that the contrary assumption, that u attains in
T both its bounds, m < M, implies a contradiction. We first select a point $P_1$
at which u = M, and then choose a point Q of t such that the segment $P_1Q$
lies in T except for its extremity Q. We note that Q is not a point of c(P) for
any P in T; for if it were we should have, contrary to hypothesis, m = M,
because u would be continuous at Q and would tend, according to our remark
in §2.2, to M on one sequence, and to m on another sequence, of points
tending to Q.

Consider now $c(P_1)$. On this set we have u = M. Further, $c(P_1)$ has at
least one point in common with $P_1Q$. We denote by $P_2$ one such point,
observing that $P_2$ is a point of T.

Consider next the set of points on $P_2Q$ at which u = m. Since $P_2$ is a point
of some C(P) on the boundary of which u = m, this set is not void. In addition,
if we adjoin to this set the point Q, the resulting set is closed.
Accordingly, there is a first point, starting from $P_2$, of $P_2Q$ at which u = m. We
denote this point by $P_3$, observing that $P_3$ is a point of T. Now at $P_2$, u = M,
and on $c(P_3)$, u = m < M. It follows from this, and our choice of $P_3$, that $P_2$
is a point of $C(P_3)$. We deduce further, on applying the fact that u = M on
$c(P_1)$ and the fact that the domains C(P) satisfy condition (b), that $c(P_1)$ is
contained in $C(P_3)$. Thus,


$\delta(P_1) \le \delta(P_3)$.


We continue this process. We select a point $P_4$ of intersection of $P_3Q$ with
$c(P_3)$, noting that $P_4$ is a point of T. We next select the first point, starting
from $P_4$, of $P_4Q$ at which u = M. We denote this point by $P_5$, observing that
$C(P_5)$ contains $P_4$. We find also that $C(P_5)$ contains $c(P_3)$; and accordingly
that


$\delta(P_1) \le \delta(P_3) \le \delta(P_5)$.

Proceeding in this manner we obtain an infinite sequence of points
$P_1, P_2, \dots$, lying on $P_1Q$ and in T. We have $P_1P_2 < P_1P_3 < \cdots < P_1Q$, and
$u(P_1) = u(P_2) = u(P_5) = \cdots = M$, and $u(P_3) = u(P_4) = u(P_7) = \cdots = m$. We
find also that


(2.41) $\delta(P_1) \le \delta(P_3) \le \delta(P_5) \le \cdots$.


We arrive now at the desired contradiction. It follows from (2.41), and
the fact that the domains C(P) satisfy the condition (a), that the points $P_n$
do not tend to Q. Accordingly, these points have a limit point P in T. But
this is impossible; for u is continuous at P, and is therefore distinct from at
least one of its bounds in some neighborhood of P.

2.5. It is conceivable that, under the restrictions placed upon u in Theorem
III, u can attain neither of its bounds in T without reducing to a constant.
This however is not the case. An example illustrating this point follows.

We take for T the interior of the unit circle about some fixed point O as
center. We shall so define u that it assumes its lower bound in T everywhere
in T except in certain circles $k_1, k_2, \dots$. The domain C(P) will be for each
P the interior of a circle about P as center.

Let Q be a point on the boundary of T. Let $P_1, P_2, \dots$ be points on OQ
such that

$0 < OP_1 < OP_2 < \cdots < OQ, \qquad \lim_{n \to \infty} P_n = Q.$


About $P_n$ as center we now construct a circle $d_n$ of radius $\rho_n$, choosing $\rho_n$ so
that each $d_n$ lies in T, and is exterior to every other $d_m$ of the set. The circle
$k_n$, then, shall be the circle about $P_n$ with radius


Tn = prl(l — 


The circles $k_n$ have the following properties. Each is interior to T and each
is exterior to every other circle of the set. Further, if P is interior to $k_n$, then
the circle $e_1(P)$ with P as center and radius $2r_n$ lies exterior to $k_n$ and interior
to $d_n$, whereas the circle $e_2(P)$ with center at P and passing through $P_{n+1}$ lies
in T. We denote by $K_n$ the interior of $k_n$.

Turning now to the definition of u, we first choose a set of positive numbers
$B_1, B_2, \dots$. We take $B_1 = 1$. We then choose $B_2$ so that the arithmetic
mean on $e_2(P')$ of the function $u_2(P)$, defined as


B2(r2 — 


for P in $K_2$ and as 0 elsewhere, exceeds $B_1$ independently of the position of
P' in $K_1$. This is possible since the mean in question exceeds a constant multiple
of $B_2$ for all P' in $K_1$. Continuing in this manner we choose, in general,
$B_{n+1}$ so large that the arithmetic mean on $e_2(P')$ of the function $u_{n+1}(P)$,
defined as


PP 11) 


for P in $K_{n+1}$ and as 0 elsewhere, exceeds $B_n$ independently of the position of
P' in $K_n$.
We now define u(P) as 
B,(t, — 


for P in $K_n$, $n = 1, 2, \dots$, and as 0 at all other points of T. It is evident then
that u is continuous in T and assumes in T its lower bound, 0. It is clear also
that, if P is exterior to all the circles $k_n$, then u has the mean-value property
at P with respect to all sufficiently small circles about P. It remains to consider
the points of $k_n + K_n$.

Suppose first that P is on $k_n$. For such a point we can take for C(P) the
interior of the circle with P as center and with radius $2r_n$; for then C(P) + c(P)
lies in T and u is 0 at P and on c(P).

Suppose now that P is in $K_n$. Consider the circles $e_1(P)$ and $e_2(P)$. On
$e_1(P)$ the arithmetic mean $A_1(P)$ of u is 0; and on $e_2(P)$ the arithmetic mean
$A_2(P)$ of u exceeds $B_n$ because of our choice of the $B_n$. Thus,


$A_1(P) < u(P) < A_2(P)$, since $0 < u(P) < B_n$.


Now as $\eta$ varies from $\eta_1$, the radius of $e_1$, to $\eta_2$, the radius of $e_2$, the circle of
radius $\eta$ about P as center remains in T, and the arithmetic mean of u on
this circle varies continuously from $A_1$ to $A_2$. Hence we can select an $\eta$ so
that the arithmetic mean of u on the corresponding circle is exactly u(P).
The interior of this circle we can take for C(P). The function u has then the
mean-value property at P. Accordingly, u has all the asserted properties.

3.1. Sufficient conditions that u be harmonic in T. We turn now to a 
study of the conditions under which a function possessing the generalized 
mean-value property is necessarily harmonic. One result in this direction is 
readily obtained as a corollary of Theorem I. 


THEOREM IV. Let u be continuous in $T + \tau$, and satisfy u(P) = A(u) in T.
Then, if there exists a function v, harmonic in T, such that


$\lim_{P \to Q} \{u(P) - v(P)\} = 0$ (P in T)


at every point Q of t, it follows that u is harmonic in T.


We note, in fact, that u - v is continuous in T + t if defined as 0 on t.
Accordingly, v is continuous on $T + \tau$ if defined as u on $\tau$. It follows that v, and
therefore that u - v, has the generalized mean-value property in T. The
bounds of u - v in T are then both 0. This proves the theorem.

3.2. The condition as to the existence of v in Theorem IV is satisfied, of
course, if we suppose that T is a normal domain, and that u is given as
continuous in T + t. This suggests a possible result of a much deeper character,
namely, that a continuous function having the generalized mean-value property
is harmonic if it tends to continuous boundary values at all regular
points* of the boundary of its region of definition. In the next theorem we
show that for a bounded function of this type this result is indeed true.


THEOREM V. Let u be bounded and continuous in $T + \tau$; and let u(P) = A(u)
in T. In addition, let u(P) approach at each regular point Q of t a limit value
f(Q) when P, while remaining in T, tends to Q. Then, if the values of f(Q) are
those of a function continuous on t, it follows that u is harmonic in T.


The proof rests on the following lemma. 


LEMMA. Let U be continuous in $T + \tau$; and let U(P) = A(U) in T. Let V be
harmonic in T. Then U - V tends to its lower bound m in T on a sequence of
points in T tending to a point of t.


The reasoning in this lemma is an extension of that in Theorem I. Let $\lambda$
be the set of points in T at which U - V = m. If $\lambda$ is void the conclusion follows
from the continuity of U - V in T. If $\lambda$ is not void, and if c(P) is for each
point P of $\lambda$ interior to T, the conclusion follows as in Theorem I. Let us suppose,
then, that there is a point $P_0$ of $\lambda$ such that $c(P_0)$ has a point Q in common
with t. Let W be the function, harmonic in $C(P_0)$, continuous in
$C(P_0) + c(P_0)$, which coincides on $c(P_0)$ with U. We observe first that


(3.21) $W - V \equiv m$

in $C(P_0)$. In fact, we have

$W(P_0) - V(P_0) \le U(P_0) - V(P_0) = m$;


so that if (3.21) failed to hold we could select a sequence of points $\{P_n\}$,
$n = 1, 2, \dots$, in $C(P_0)$, tending to a point P' of $c(P_0)$, such that


$\lim \{W(P_n) - V(P_n)\} < m$.


But

$\lim U(P_n) = U(P') = \lim W(P_n)$;

and we should thus have, contrary to the definition of m, U - V < m on some
points of $C(P_0)$. The proof is now immediate; for, as a consequence of (3.21),
we have


$\lim_{P \to Q} \{U(P) - V(P)\} = \lim_{P \to Q} \{W(P) - V(P)\} = m$ (P in $C(P_0)$).


* A point Q on the boundary r of a region R, bounded or not, is regular for r if the sequence
solution of the Dirichlet problem for R, corresponding to any set of continuous boundary values f,
tends to f(Q) at Q. Compare, for example, Wiener, 9, p. 128, or Kellogg, 4, p. 606. In 2, p. 326,
Kellogg defines regularity (for three dimensions) by means of the Poincaré-Lebesgue barrier concept.
As a consequence of a theorem due to Lebesgue, the complete proof of which can be found in the
material in 2, pp. 326-328, and 4, pp. 607-609, the barrier definition is equivalent to that given above.
The barrier definition is readily extensible to the plane and its equivalence to the sequence solution
can be established.


3.3. Returning now to the proof of Theorem V, we first observe that it is
enough to consider the case in which T lies in the interior S of a circle s of
diameter less than 1/2. It is easily seen in fact, on applying a transformation
of similitude, that, if the result holds in this case, then it holds in general.*

We let v denote the sequence solution of the Dirichlet problem for T,
corresponding to the boundary values f on t; and we let w denote either of
the functions, u - v or v - u. We observe that, as a consequence of our
hypotheses and a familiar property† of the sequence solution, the lower bound
m of w in T is finite. To prove the theorem we show that m cannot be negative.
We assume the contrary, that m is negative, and arrive eventually at a
contradiction.

Let e denote the set of points Q of t at which


(3.31) $\lim_{P \to Q} w(P) \le m/2$ (P in T).

On applying the preceding lemma we see that this set is not void. Plainly, it
is bounded and closed. Further, its complement E with respect to the plane
is a domain. Since E is open, and contains T, and T is connected, this will
follow if we prove that, if Q is any point common to E and the complement
of T, then Q can be joined to a point of T by a polygonal line not passing
through e. For this, suppose first that Q is a point of t - e. In this case the
conclusion is immediate; for then Q is at non-zero distance from e and at zero
distance from T. Suppose on the other hand that Q is exterior to T + t. Let
R be a point of t such that no other point of t lies nearer to Q than does R.
Then R is regular for t;‡ and accordingly,


$\lim_{P \to R} w(P) = 0$ (P in T).


It follows that R is not a point of e. We conclude that Q can be joined to a
point of T by a polygonal line of the required type. Accordingly, E is a domain.


* This reduction is useful only in the plane case.

† The property in question is that the sequence solution lies between the extremes of the assigned
boundary values. This results directly from the definition of the sequence solution. See, for example,
Wiener, 10, p. 39, or Kellogg, 2, pp. 317-326. Kellogg's arguments are for three dimensions. Analogous
arguments hold, however, in the plane.

‡ This is a consequence of any one of several criteria for regularity. For references see, for
example, Wiener, 9, pp. 130 and 142.

We now form the conductor potential $\xi$ of the set e.* We first construct a
sequence $\{E_n\}$, $n = 1, 2, \dots$, of normal, unbounded domains, nested in and
approximating E.† In particular we construct these domains so that for each
n the (finite) boundary $e_n$ of $E_n$ lies in S, and the set $E_n + e_n$ contains no point
of e. Then, if T is a three-dimensional domain, we denote by $\xi_n$ the function,
harmonic in $E_n$, which vanishes at infinity and assumes continuously the
boundary values 1 on $e_n$. On the other hand, if T is a plane domain, we first
select a point O of e; then denote by $\zeta_n$ the function, harmonic and bounded
in $E_n$, which assumes continuously the values log OP on $e_n$; and finally set


$\xi_n(P) = \{a_n + \log OP - \zeta_n(P)\}/a_n$,


where $a_n$ is the value of $\zeta_n$ at infinity. In both instances we extend the definition
of $\xi_n$ over the points exterior to $E_n + e_n$ by defining it equal to 1 there.
Then, at each point P of E the sequence $\{\xi_n(P)\}$ converges. The limit function
is the conductor potential $\xi$.

In regard to $\xi$ we now show that


(3.32) $w(P) \ge m\{1 + \xi(P)\}/2$

for all P in T. Let n be any positive integer. We observe first that
(3.33) 


in S. This is clear in three dimensions. To see that it is true in the plane we
have only to note that

$a_n < \log 1/2 < 0$,

as this implies that $\xi_n$ becomes negatively infinite at infinity. We observe
next that $\xi_n$ is continuous in S. We see finally that $\xi_n$ is superharmonic in S.
In fact, if P is any point of $S \cdot E_n$, or of $S - S \cdot (E_n + e_n)$, then the value of $\xi_n$
at P is equal to the arithmetic mean of its values on every sufficiently small
circle about P as center. On the other hand, if P is a point of $S \cdot e_n$, then the
value of $\xi_n$ at P exceeds, as we see by applying (3.33), the arithmetic mean of
its values on every sufficiently small circle about P as center. We conclude,
as a consequence of a familiar theorem‡ on superharmonic functions, that
$\xi_n$ is superharmonic in S.

* In this connection see, for example, Wiener, 9, p. 142, and 10, p. 26, or for the
three-dimensional case, Kellogg, 2, p. 330. Kellogg's treatment of the conductor potential can be
extended to the plane.

† For such a construction see, for example, Kellogg, 2, pp. 317-323. The author is chiefly
interested here in bounded three-dimensional domains. The reasoning, however, is applicable to plane
and to unbounded domains.

‡ See, for example, Kellogg, 2, p. 330.


Consider, then, the function

$\alpha_n(P) = w(P) - m\{1 + \xi_n(P)\}/2$.


The function $\pm u - m(1 + \xi_n)/2$ satisfies the conditions imposed upon U in
our lemma. On the other hand, $\pm v$ satisfies the conditions imposed upon V.
Accordingly, $\alpha_n$ tends to its lower bound $m_n$ in T on a sequence of points
$\{P_j\}$, $j = 1, 2, \dots$, in T tending to a point Q of t. Now, if Q is a point of e,
we have

$\lim w(P_j) \ge m, \qquad \lim \{-m(1 + \xi_n)/2\} = -m$.


On the other hand, if Q is a point of t - e, we have


$\lim w(P_j) \ge m/2$,


and also, as we see on applying the fact that $\xi_n \ge 0$ in S*,

$\lim \{-m(1 + \xi_n)/2\} \ge -m/2$.


We deduce that $m_n \ge 0$. Accordingly,

$w \ge m(1 + \xi_n)/2$


in T for every n. Allowing n to become infinite, we obtain (3.32).

The proof of the theorem can now easily be completed. We observe first
that the capacity of e is 0.† In fact, if its capacity were positive, it would
contain at least one point Q regular for the boundary of E.‡ But the point Q,
being regular for the boundary of E, would be regular for t since T is contained
in E and t contains Q.§ We should therefore have at a point of e


$\lim_{P \to Q} w(P) = 0$ (P in T);


and this is impossible. Thus the capacity of e is 0. But now, since the capacity
of e is 0, $\xi$ vanishes identically in E,* and therefore in T. It follows from
(3.32) that in T


$w \ge m/2$.


This gives us, as a consequence of the definition of m, the desired contradiction;
and this completes the proof.


* It is plain that this inequality holds if T is a three-dimensional domain. To see that it holds
in the plane, one need only apply the formulas given by Wiener in 9, p. 142. Essentially, it was in
order to obtain this inequality that we reduced the problem in the beginning of the proof.

† For the definition of capacity see, for example, Wiener, 9, p. 143, and 10, p. 26, or Kellogg, 2,
p. 330.

‡ The lemma that every bounded, closed set of positive capacity contains at least one point
regular for the boundary of the unbounded domain bounded by the set is due in the plane to Kellogg,
5, and in space to Evans, 1. Evans' proof is valid in the plane.

§ This is immediate in view of the equivalence of the barrier definition to the sequence definition.
See, for example, Kellogg, 2, p. 328. It also follows from Wiener's fundamental criterion on regular
points. See Wiener, 9, pp. 130 and 142.

4.1. On a construction of the sequence solution of the Dirichlet problem.
The reasoning of Theorem I is applicable in another connection. We close
this paper by obtaining by means of it a theorem concerning a method by
which the sequence solution of the Dirichlet problem can be constructed.
This method was first considered by Lebesgue.† Lebesgue's results were later
extended by Perkins.‡

In this theorem we shall suppose that the domains C(P) satisfy the following
conditions:

(c) if U is continuous in T + t, then A{U(P)} is continuous in T;
(d) if U is continuous in T + t, then


$\lim_{P \to Q} A\{U(P)\} = U(Q)$


at every point Q of t.

We note that these conditions are equivalent to the following:
(e) if U is continuous in T + t, then the function $U_1(P)$, defined as A{U(P)} in
T, and as U(P) on t, is continuous in T + t.

The theorem is then 


THEOREM VI. Let the domains C(P) satisfy conditions (c) and (d) above.
Let $u_0$ be continuous in T + t; and let


(4.11) $u_n(P) = \begin{cases} A(u_{n-1}), & P \text{ in } T, \\ u_{n-1}(P), & P \text{ on } t, \end{cases} \qquad n = 1, 2, \dots.$


Then at every point P of T the sequence $\{u_n(P)\}$ converges to the sequence solution
v(P), corresponding to the boundary values $u_0$ on t, of the Dirichlet problem
for T. Further, the convergence is uniform on any closed subset of T.


In regard to the condition (d), we note that (d) is satisfied if the condition
(a) of §2.4 holds. In fact, if U is continuous in T + t, we can, given P, select
points $P_1$ and $P_2$ on c(P), such that


$U(P_1) \le A\{U(P)\} \le U(P_2)$.


If, now, (a) holds, and if P tends to a point Q of t, then $P_1$ and $P_2$ tend to Q,
and $U(P_1)$ and $U(P_2)$ tend to U(Q). Accordingly, A{U(P)} tends to U(Q).
Thus, (d) can be replaced by (a) in our theorem.


* For the three-dimensional case see, for example, Kellogg, 3, p. 403. For both cases see Wiener,
9, p. 142, and 10, p. 26. Kellogg's proof can be extended to the plane.

† Lebesgue, 7.

‡ Perkins, 8.

Now, as pointed out before, (a) holds if for each point P of T, C(P) is the
interior of a circle about P. Moreover, if in addition we assume that the radius
of c(P) is a continuous function of P, then (c) holds. Thus it is enough
to assume in the theorem that C(P) is for each P the interior of a circle about
P, and that the radius of c(P) is a continuous function of P. A family of circles
which satisfies this second condition is that in which the radius of c(P)
is, for each P, equal to the distance from P to t. It was with this family of
circles that Lebesgue and Perkins were concerned. Lebesgue showed that, if
T is a normal domain, the sequence defined in (4.11) converges to the solution,
corresponding to the boundary values $u_0$ on t, of the Dirichlet problem
for T. Perkins extended Lebesgue's result to an arbitrary domain, thereby
obtaining the result of Theorem VI for Lebesgue's family of circles. In each
case the method of proof is somewhat different from ours.
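
The iteration (4.11) can be tried numerically in a one-dimensional analogue, which is an assumption of this sketch and not the plane or space setting of the theorem. Here T is the interval (0, 1), t consists of the two end points, the "circle" about x with radius equal to the distance from x to t is the pair of points x - r, x + r, and harmonic functions are the linear ones. The function name `iterate_means` and the grid are illustrative only.

```python
def iterate_means(u0, steps):
    """Iterate u_n = A(u_{n-1}) on the grid x_i = i/N of [0, 1].

    A(u)(x_i) is the mean of u over the two points at grid distance
    r = min(i, N - i) from x_i, i.e. the one-dimensional analogue of the
    circle about x_i whose radius is the distance from x_i to the boundary.
    The boundary entries (i = 0 and i = N) are never changed.
    """
    n = len(u0) - 1
    u = list(u0)
    for _ in range(steps):
        nxt = u[:]                      # copies the boundary values unchanged
        for i in range(1, n):
            r = min(i, n - i)           # distance to the boundary, in grid steps
            nxt[i] = 0.5 * (u[i - r] + u[i + r])
        u = nxt
    return u


N = 16
u0 = [(i / N) ** 2 for i in range(N + 1)]    # continuous start, boundary values 0 and 1
u = iterate_means(u0, 60)

# The Dirichlet solution with boundary values 0 and 1 is the linear function x.
max_err = max(abs(u[i] - i / N) for i in range(N + 1))
print(max_err)
```

In this analogue the deviation from the linear solution is at least halved at every step, so the printed error is at the level of rounding.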

Another point which might be mentioned in connection with the above
theorem is that, although we are apparently concerned only with the sequence
solution corresponding to values on the boundary which are those
of a function continuous throughout T + t, this is in reality the general case.
Given a function continuous on t, we can, of course, always extend its definition
so that the resulting function is continuous on T + t.

4.2. We first prove the theorem in the case that $u_0$ is a superharmonic
polynomial. In this case we have $u_0 \ge A(u_0) = u_1 \ge m$, where m is the minimum
of $u_0$ in T + t, and if $n = 1, 2, \dots$. Accordingly


(4.21) 


and we can conclude at once that the sequence $\{u_n\}$ converges at each point
P of T + t to a limit u(P). We have, then, to show that u = v in T, and that
the convergence is uniform in any closed subset of T. Now, of these two propositions,
the second follows immediately from the first. In fact, since the $u_n$
are continuous in T and since (4.21) holds, the convergence, by a familiar
theorem, is necessarily uniform in any closed subset of T if the limit function
is continuous in T. Accordingly, our problem reduces to showing that u = v
in T.

Let $T_1, T_2, \dots$ be a set of normal domains nested in and approximating
T. Let $v_k$ be the solution of the Dirichlet problem for $T_k$, corresponding to
the boundary values $u_0$ on the boundary $t_k$ of $T_k$. Let $v_k(P)$ be defined as
equal to $u_0(P)$ for P exterior to $T_k + t_k$. Then $v_k$ is continuous and superharmonic
in the plane and we have




uo(P) = 


in T. Further, at each point P of T, the sequence $\{v_k\}$ converges to v(P).
We thus have


(4.22) $u_0(P) \ge v(P)$

in T.

As a consequence of this last inequality and the lemma of §3.2, it is easily
seen that


(4.23) $u - v \ge 0$


in T. In fact, for any fixed integer n > 0, we have

$u_n(P) = A(u_{n-1}) \ge A(u_n)$


in T. Hence, since $u_n$ is continuous in T + t and v is harmonic in T, the function
$u_n - v$ tends to its lower bound in T on a sequence of points $\{P_j\}$,
$j = 1, 2, \dots$, in T tending to a point Q of t. Now,


lim u,(P;) = v(P;) S 


the latter by (4.22). It follows that

$u_n - v \ge 0$

in T. Allowing n to become infinite, we see that (4.23) holds.
We have now only to prove that


(4.24) $v - u \ge 0$


in T. For this we consider the function $v_k - u$ corresponding to some integer
k. We shall show that $v_k - u$ assumes its lower bound m' in T + t on a point of
t. This will of course justify (4.24). In fact, we have


$v_k - u = u_0 - u_0 = 0$


on t and

$\lim_{k \to \infty} (v_k - u) = v - u$


in T.

We first observe that, as a consequence of the continuity of the $u_n$ and
the fact that (4.21) holds, u is upper semi-continuous in T + t. Accordingly,
since $v_k$ is continuous in T + t, $v_k - u$ is lower semi-continuous there. It follows
that the set $\lambda$ of points in T + t, at which $v_k - u = m'$, is not void. To obtain
our desired conclusion we prove first that, if P' is a point of $\lambda \cdot T$, then all the
points of c(P') are points of $\lambda$.

Let $U_n$, $n = 0, 1, \dots$, be the function, harmonic in C(P'), continuous
in C(P') + c(P'), which assumes the values $u_n$ on c(P'). Let V be the function
having the corresponding properties with regard to $v_k$. Then we have


in C(P’), and 
lim U,(P’) = u(P’) =m (> — 


no 


It follows that the sequence $\{U_n\}$ converges in C(P') to a function U harmonic
in C(P'). Now, we have


$V(P') - U(P') \le v_k(P') - u(P') = m'$.

Accordingly, either
(4.25) 


in C(P'), or else there is a sequence of points $\{P_j\}$, $j = 1, 2, \dots$, in C(P'),
tending to a point Q of c(P'), such that


$\lim \{V(P_j) - U(P_j)\} = m' - a$,


where a is positive. But, if the second of these two alternatives holds, we have


(4.26) $v_k(Q) - u_n(Q) = \lim \{V(P_j) - U_n(P_j)\} \le \lim \{V(P_j) - U(P_j)\} = m' - a$


for every integer $n \ge 0$. It follows that (4.25) holds; for otherwise, by (4.26),
we should have


$v_k(Q) - u(Q) < m'$,

contrary to the definition of m'. Let, then, Q be any point of c(P'). We have,
on applying (4.25),

$v_k(Q) - u_n(Q) = \lim_{P \to Q} \{V(P) - U_n(P)\} \le m'$ (P in C(P'))


for every integer $n \ge 0$. It follows that c(P') consists wholly of points of $\lambda$.

We can readily deduce now that there is a point of $\lambda$ on t. We have only
to apply the reasoning of Theorem I. Since $v_k - u$ is lower semi-continuous in
T + t and since T + t is closed, the set $\lambda$ is closed. Hence, since $\lambda$ is not void,
and t is closed, there is a point P of $\lambda$ whose distance to t is equal to the
distance $\delta$ from $\lambda$ to t. If, now, we assume that $\delta$ is positive we immediately get
a contradiction; for there is a point of c(P) nearer to t than is P and by our
previous reasoning all the points of c(P) are points of $\lambda$. The theorem for
superharmonic polynomials is thus completely established.

4.3. Turning now to the general case, that in which $u_0$ is given as continuous
on T + t, we let R denote a closed subset of T and $\epsilon$ an arbitrary positive
number. To prove the theorem it is enough to show that


$\limsup_{n \to \infty} |u_n - v| \le \epsilon$


uniformly in R.
Now, by the Weierstrass theorem, we can find a polynomial $\bar{u}$ such that


$|u_0 - \bar{u}| < \epsilon/2$


everywhere in T + t. Next, we can write

$\bar{u} = u_0' - u_0''$,

where $u_0'$ and $u_0''$ are superharmonic polynomials. We set $u_0''' = u_0 - \bar{u}$ and
consider the sequences $\{u_n'\}$, $\{u_n''\}$, $\{u_n'''\}$ built upon the continuous functions
$u_0'$, $u_0''$, $u_0'''$ in the same way that $\{u_n\}$ is built upon $u_0$.
We note first that


$|u_n'''| < \epsilon/2$


in T + t since $|u_0'''| < \epsilon/2$ on t. We note next that, by the conclusion of the
preceding paragraph, we have uniformly in R


$\lim u_n' = v', \qquad \lim u_n'' = v''$,


where v' and v'' are the sequence solutions of the Dirichlet problem for T
corresponding to the boundary values $u_0'$ and $u_0''$. Thus, since


$u_n = u_n' - u_n'' + u_n'''$,

we have


$\limsup_{n \to \infty} |u_n - v' + v''| < \epsilon/2$


uniformly in R. But, denoting by v''' the sequence solution of the Dirichlet
problem for T corresponding to the boundary values $u_0'''$, we have

$v = v' - v'' + v''', \qquad |v'''| < \epsilon/2$

in T. We deduce that uniformly in R

$\limsup_{n \to \infty} |u_n - v| \le \limsup_{n \to \infty} |u_n - v' + v''| + |v'''| < \epsilon$.

This completes the proof.


REFERENCES 


1. Evans, G. C., Application of Poincaré’s sweeping-out process, Proceedings of the National Acad- 
emy of Sciences, vol. 19 (1933), pp. 457-461. 
2. Kellogg, O. D., Foundations of Potential Theory, Berlin, 1929. 
3. ———— On the classical Dirichlet problem for general domains, Proceedings of the National 
Academy of Sciences, vol. 12 (1926), pp. 397-406. 
4. ———— Recent progress with the Dirichlet problem, Bulletin of the American Mathematical So- 
ciety, vol. 32 (1926), pp. 601-625. 
5. ———— Unicité des fonctions harmoniques, Comptes Rendus de l'Académie des Sciences, vol. 187
(1928), pp. 526-527.
6. Koebe, P., Herleitung der partiellen Differentialgleichung der Potentialfunktion aus der Integral- 
eigenschaft, Sitzungsberichte der Berliner Mathematischen Gesellschaft, vol. 5 (1906), pp. 39-42. 
7. Lebesgue, H., Sur le problème de Dirichlet, Comptes Rendus de l'Académie des Sciences, vol. 154
(1912), pp. 335-337.
8. Perkins, F. W., Sur la résolution du probléme de Dirichlet par des médiations réitérées, ibid., vol. 
184 (1927), pp. 182-183. 
9. Wiener, N., The Dirichlet problem, Journal of Mathematics and Physics, Massachusetts Institute 
of Technology, vol. 3 (1924), pp. 127-146. 
10. ————— Certain notions in potential theory, ibid., vol. 3 (1924), pp. 25-51. 




HARVARD UNIVERSITY, 
CAMBRIDGE, Mass. 




CONTRIBUTIONS TO THE THEORY OF 
FINITE FIELDS* 


BY 
OYSTEIN ORE 


The present paper contains a number of results in the theory of finite
fields or higher congruences. The method may be considered as an application
of the theory of p-polynomials, which I have developed in a recent paper†
On a special class of polynomials. In this special case the p-polynomials form
a commutative ring. However, this paper may be read without reference to
the former investigations and one may say that the method applied is the
representation of the finite field in its group ring. It should be mentioned at
this point that a number of the results have direct applications in the theory
of algebraic numbers.

In chapter 1 the special properties of the p-polynomials with coefficients
in a finite field have been derived and the main results are the theorems that
every p-polynomial has primitive roots and that every p-modulus is simple.
A corollary is the theorem of Hensel, that every finite field has a basis
consisting of conjugate elements. Through the introduction of a symbolic
multiplication of elements in a p-modulus we make every such modulus a ring
usually containing divisors of zero. The results of this first chapter I have
previously given without proofs.‡

In chapter 2 various theorems of decomposition and theorems on prime
polynomials belonging to a product of p-polynomials have been derived.
Theorems 4 and 5 seem to be the most interesting of the results. In the next
chapter these results are applied to the construction of irreducible polynomials.
Theorem 1 gives a general type of irreducible polynomials. Next the complete
prime polynomial decompositions of the simplest p-polynomials are
given, and it is shown how most known irreducible polynomials (mod p) can
be obtained in this way, thus obtaining a unified method for deriving various
formerly known results. In the last paragraph one finds a new class of irreducible
polynomials closely related to the linear fractional substitutions. The
last chapter contains a few rudiments of the theory of finite fields considered
as cyclic fields and also a particularly simple proof for the general law of
reciprocity.


* Presented to the Society, October 28, 1933; received by the editors December 1, 1933.

† These Transactions, vol. 35 (1933), pp. 559-584.

‡ O. Ore, Einige Untersuchungen über endliche Körper, Proceedings 7th Scandinavian
Mathematical Congress, Oslo, 1930, pp. 65-67.


CHAPTER 1. THEOREMS ON FINITE FIELDS 


1. Fundamental properties of p-polynomials. In the following, polynomials
with rational integral coefficients will be studied for a rational prime
modulus p; since almost all congruences occurring in this paper are taken
with respect to this modulus, we shall, when no ambiguity is to be feared,
replace congruences (mod p) by equalities.

A polynomial of the form


(1) $F(x) = a_0 x^{p^n} + a_1 x^{p^{n-1}} + \cdots + a_n x$


shall be called a p-polynomial. F(x) is reduced when $a_0 = 1$. The polynomial


(2) $f(x) = a_0 x^n + a_1 x^{n-1} + \cdots + a_n$

is called the polynomial corresponding to F(x); the degree n of f(x) is called the
exponent of F(x).

The system of all p-polynomials forms a modulus, but not a ring, since
the ordinary product of two p-polynomials is not a p-polynomial. One finds,
however, that the pth power of a p-polynomial (mod p) is again a p-polynomial;
this shows that if

(3) $G(x) = b_0 x^{p^m} + \cdots + b_{m-1} x^{p} + b_m x$

is a second p-polynomial with the corresponding polynomial

(4) $g(x) = b_0 x^m + \cdots + b_{m-1} x + b_m$,

then the result of substituting F(x) in G(x) is also a p-polynomial G(F(x)).
We therefore are led to the definition of a symbolic multiplication

(5) $G(x) \times F(x) = G(F(x))$,
and a simple investigation of the symbolic product gives the following re- 
sults: 


THEOREM 1. The symbolic multiplication is commutative and distributive 
and the polynomial corresponding to a symbolic product is equal to the product 
of the corresponding polynomials of the symbolic factors. 
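
Theorem 1 can be checked on a small case. The sketch below is only an illustration under stated assumptions: the prime p = 3, the sample polynomials, and the helper names `poly_mul`, `compose` and `p_polynomial` are all choices made here, and coefficients are taken in the prime field (mod p). Polynomials are stored as coefficient lists with the lowest degree first, and the symbolic product is computed literally as the substitution G(F(x)).

```python
P = 3  # illustrative prime modulus (an assumption of this sketch)


def trim(a):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    a = list(a)
    while len(a) > 1 and a[-1] == 0:
        a.pop()
    return a


def poly_mul(a, b, p=P):
    """Ordinary product of two polynomials, coefficients reduced mod p."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out


def compose(g, f, p=P):
    """Ordinary substitution g(f(x)) mod p, by Horner's rule."""
    result = [0]
    for coeff in reversed(g):
        result = poly_mul(result, f, p)
        result[0] = (result[0] + coeff) % p
    return trim(result)


def p_polynomial(f, p=P):
    """Map f(x) = sum c_k x^k to the p-polynomial F(x) = sum c_k x^(p^k)."""
    out = [0] * (p ** (len(f) - 1) + 1)
    for k, c in enumerate(f):
        out[p ** k] = c % p
    return out


f = [2, 1]       # f(x) = x + 2,        so F(x) = x^3 + 2x
g = [1, 1, 1]    # g(x) = x^2 + x + 1,  so G(x) = x^9 + x^3 + x
F, G = p_polynomial(f), p_polynomial(g)

symbolic = compose(G, F)                          # G(x) x F(x) = G(F(x))
expected = trim(p_polynomial(poly_mul(g, f)))     # p-polynomial of g(x) f(x)
print(symbolic == expected, compose(F, G) == symbolic)
```

Here g(x) f(x) = x^3 + 2 (mod 3), and both orders of substitution give x^27 + 2x, in agreement with the commutativity and the product rule asserted in the theorem.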


If consequently

$F_1(x), \dots, F_r(x)$


are p-polynomials with the corresponding polynomials

$f_1(x), \dots, f_r(x)$,

then the symbolic product

$\Pi(x) = F_1(x) \times \cdots \times F_r(x)$

has the corresponding polynomial

$\pi(x) = f_1(x) \cdots f_r(x)$.


We shall say that P(x) is a symbolic prime polynomial, when it is reduced
and no symbolic decomposition $P(x) = A(x) \times B(x)$ exists except when one
of the factors has the exponent zero. One could also have used the correspondence
stated in Theorem 1 and defined P(x) as a prime p-polynomial
when the corresponding polynomial is irreducible (mod p). This correspondence
immediately shows


THEOREM 2. The decomposition of a p-polynomial in symbolic prime factors 
is unique. 


One could also have concluded this from the fact that there exists a Euclid
algorithm for the symbolic multiplication. When two p-polynomials A(x)
and B(x) are given, one can find two others Q(x) and R(x) such that


(6) A(x) = B(x) X Q(x) + R(x) 


where the exponent of R(x) is smaller than the exponent of B(x). From (6) 
the existence of a Euclid algorithm follows; there exists a greatest common 
symbolic factor for any two or more p-polynomials. When A(x) and B(x) 
have only the trivial symbolic common factor x, we say that A(x) and B(x) 
are symbolically relatively prime. 

It should be observed that when A(x) is symbolically divisible by D(x),
then A(x) is also divisible by D(x) in the ordinary sense and conversely.
Namely, from $A(x) = Q(x) \times D(x)$ it follows, when $Q(x) = q_1(x) \cdot x$, that
$A(x) = q_1(D(x)) \cdot D(x)$. On the other hand, let A(x) be divisible by D(x) in
the ordinary sense; one can divide A(x) symbolically by D(x) and obtain


(7) $A(x) = Q(x) \times D(x) + R(x) = q_1(D(x)) \cdot D(x) + R(x)$.


Here the degree of R(x) is smaller than the degree of D(x), and the second 
equation (7) shows that R(x) =0. This reasoning also shows that the sym- 
bolic Euclid algorithm will contain the same residues as the ordinary Euclid 
algorithm. One obtains in particular 


THEOREM 3. The greatest common symbolic factor of two p-polynomials is 
the same as the ordinary greatest common factor of the p-polynomials. 
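
As an illustration of Theorem 3, the ordinary greatest common factor of two p-polynomials can be compared with the p-polynomial of the greatest common factor of their corresponding polynomials. This is a hedged sketch under the same assumptions as before (prime field F_p, coefficient lists with the lowest degree first); `poly_mod`, `gcd` and `p_poly` are illustrative helper names, not anything defined in the paper.

```python
def trim(a):
    while a and a[-1] == 0:
        a = a[:-1]
    return a

def poly_mod(a, b, p):
    """Remainder of a on division by the nonzero polynomial b over F_p."""
    a = trim([c % p for c in a])
    inv = pow(b[-1], -1, p)                  # modular inverse, Python 3.8+
    while len(a) >= len(b):
        c, k = a[-1] * inv % p, len(a) - len(b)
        a = trim([(s - c * b[i - k]) % p if i >= k else s
                  for i, s in enumerate(a)])
    return a

def gcd(a, b, p):
    """Monic greatest common factor by the ordinary Euclid algorithm."""
    a, b = trim([c % p for c in a]), trim([c % p for c in b])
    while b:
        a, b = b, poly_mod(a, b, p)
    inv = pow(a[-1], -1, p)
    return [c * inv % p for c in a]

def p_poly(f, p):
    """p-polynomial corresponding to f (x^i becomes x^(p^i))."""
    F = [0] * (p ** (len(f) - 1) + 1)
    for i, c in enumerate(f):
        F[p ** i] = c % p
    return trim(F)

p = 2
f = [1, 0, 1]        # x^2 + 1 = (x + 1)^2
g = [1, 0, 0, 1]     # x^3 + 1 = (x + 1)(x^2 + x + 1)
F, G = p_poly(f, p), p_poly(g, p)          # x^4 + x and x^8 + x
print(gcd(F, G, p), p_poly(gcd(f, g, p), p))
```

In this example both printed lists describe x^2 + x, whose roots form the prime field common to the fields of 4 and 8 elements.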


When therefore A(x) and B(x) are symbolically relatively prime, then the
ordinary greatest common factor of A(x) and B(x) is x and conversely. Let
us also observe that in this case one can determine two p-polynomials X(x)
and Y(x) such that




246 OYSTEIN ORE ’ [April 


(8) $X(x) \times A(x) + Y(x) \times B(x) = x$.

The p-polynomials of the greatest interest in the following are the well 
known 
(9) $F_n(x) = x^{p^n} - x$
with the corresponding polynomial
(10) $f_n(x) = x^n - 1$.

Theorem 1 shows

THEOREM 4. When $f_n(x)$ has the ordinary prime factor decomposition
(11) $x^n - 1 = \phi_1(x) \cdots \phi_r(x)$
then $F_n(x)$ has the symbolic prime factor decomposition
(12) $x^{p^n} - x = \Phi_1(x) \times \cdots \times \Phi_r(x)$
where $\phi_i(x)$ ($i = 1, 2, \ldots, r$) is the polynomial corresponding to $\Phi_i(x)$.

2. The roots of p-polynomials. We shall now discuss the properties of the 
roots of a p-polynomial F(x) defined by (1). Since F(x) can be represented as 
the ordinary product of prime factors, it is obvious that the roots will belong 
to some finite field K. For a p-polynomial one has $F'(x) = a_n$, and this shows
that F(x) can only have equal roots when $a_n = 0$; this case will always be
excluded in the following considerations.
Let $\mu$ and $\nu$ be roots of


(13) F(x) = 0;


due to the special form of a p-polynomial one sees that $\mu + \nu$ is also a root of
(13) and, furthermore, that the pth power $\mu^p$ will also be a root.

We shall say that a finite modulus $M_p$ is a p-modulus, if it has the
property that the pth power of every element is contained in it. This definition
implies that every p-modulus lies in some finite field. We can now show


THEOREM 5. The roots of a p-polynomial form a p-modulus and every p-mod- 
ulus is the set of roots of a p-polynomial. 


The first part of the theorem follows from the remarks made above.
Since a p-modulus $M_p$ always has elements in some finite field, and since
$\mu^p$ for each $\mu$ is the conjugate of $\mu$, it follows that the totality of elements
of $M_p$ will satisfy an equation with rational coefficients. In order to show that
this is a p-polynomial, let


$\mu = k_1 \mu_1 + \cdots + k_n \mu_n$  ($k_i = 0, 1, \ldots, p - 1$)


be a representation of $M_p$ by a basis. Then all elements of $M_p$ are seen to
satisfy the equation


$F(x) = \begin{vmatrix} \mu_1 & \cdots & \mu_n & x \\ \mu_1^{p} & \cdots & \mu_n^{p} & x^{p} \\ \vdots & & \vdots & \vdots \\ \mu_1^{p^n} & \cdots & \mu_n^{p^n} & x^{p^n} \end{vmatrix} = 0$


and the fact that the elements $\mu_1, \ldots, \mu_n$ form a basis shows that the highest
coefficient does not vanish.

Theorem 5 gives a correspondence between p-moduli and p-polynomials;
we shall derive a few simple consequences. Let F(x) be a p-polynomial with
the p-modulus $M_p$; when F(x) is symbolically reducible,


$F(x) = F_1(x) \times F_2(x) = F_1(F_2(x))$,


it follows that $M_p$ must contain as a sub-modulus the roots $M_p'$ of $F_2(x)$ (or
$F_1(x)$). Conversely, if $M_p$ contains a sub-p-modulus $M_p'$ corresponding to a
p-polynomial $F_1(x)$, then according to §1, $F_1(x)$ must divide F(x) both in the
ordinary and in the symbolic sense.


THEOREM 6. The necessary and sufficient condition that F(x) be symbolically
reducible is that its p-modulus $M_p$ contain a sub-p-modulus.


We shall say that $M_p$ is a prime p-modulus, when it contains no
sub-p-modulus except the zero modulus. The necessary and sufficient condition
that $M_p$ be prime is that the corresponding p-polynomial be symbolically
irreducible. When $M_p$ and $N_p$ are two p-moduli corresponding to F(x) and
G(x), it is easily seen that $M_p + N_p$ is also a p-modulus corresponding to the
least common multiple [F(x), G(x)], and that the cross-cut $(M_p, N_p)$ is a
p-modulus corresponding to the greatest common factor (F(x), G(x)).

Now let $\mu$ be an arbitrary element of a finite field; all elements of the form


(14) $S_\mu = k_0 \mu + k_1 \mu^{p} + k_2 \mu^{p^2} + \cdots$


obviously form a p-modulus and a p-modulus generated in this way by a
single element shall be called simple. There must exist a smallest exponent a
such that a relation


$\mu^{p^a} + m_1 \mu^{p^{a-1}} + \cdots + m_{a-1} \mu^{p} + m_a \mu = 0$


holds, and the elements of the simple p-modulus (14) can then be represented
uniquely in the form


(15) $S_\mu = k_0 \mu + k_1 \mu^{p} + \cdots + k_{a-1} \mu^{p^{a-1}}$.




From the definition of a prime p-modulus it follows that every prime p-modu- 
lus is simple. It is one of the main results of this theory that 


THEOREM 7. Every p-modulus is simple. 


This theorem will be proved in §4. 

3. Polynomials belonging to a p-polynomial. Let $\phi(x)$ be an arbitrary
polynomial of degree m; it will be shown that $\phi(x)$ always divides a
p-polynomial F(x). In order to find the p-polynomial F(x) of smallest degree
having this property, we divide the successive powers $x^{p^i}$ by $\phi(x)$ and
obtain a set of congruences


(16) $x^{p^i} \equiv a_1^{(i)} x^{m-1} + \cdots + a_m^{(i)} \pmod{\phi(x)}$  ($i = 0, 1, \ldots$).


Through linear elimination one can obtain a relation (mod $\phi(x)$) between the
powers $x^{p^i}$, eliminating $1, x, x^2, \ldots, x^{m-1}$ from the right-hand side of (16).
If $\nu + 1$ is the first index such that there exists a linear homogeneous relation
between the first $\nu + 1$ polynomials on the right-hand side, then $\phi(x)$ will
divide a p-polynomial F(x) with the exponent $\nu$. The construction of F(x)
shows that it is the p-polynomial with smallest exponent divisible by $\phi(x)$
and we shall say that $\phi(x)$ belongs to F(x). The following is then easily seen:


THEOREM 8. Every polynomial $\phi(x)$ of degree m belongs to a unique
p-polynomial F(x) with the exponent $\nu \le m$. Every p-polynomial divisible by
$\phi(x)$ is symbolically divisible by F(x).
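
The elimination described before Theorem 8 can be sketched as follows. This is an illustrative reading of the procedure, not the author's own computation: it assumes the prime field F_p, coefficient lists with the lowest degree first, and a straightforward Gaussian elimination that records how each reduced remainder is combined from the remainders of x^(p^j). The function name `smallest_p_poly` is hypothetical.

```python
def trim(a):
    while a and a[-1] == 0:
        a = a[:-1]
    return a

def mul(a, b, p):
    if not a or not b:
        return []
    c = [0] * (len(a) + len(b) - 1)
    for i, s in enumerate(a):
        for j, t in enumerate(b):
            c[i + j] = (c[i + j] + s * t) % p
    return trim(c)

def poly_mod(a, b, p):
    """Remainder of a on division by the nonzero polynomial b over F_p."""
    a = trim([c % p for c in a])
    inv = pow(b[-1], -1, p)                  # modular inverse, Python 3.8+
    while len(a) >= len(b):
        c, k = a[-1] * inv % p, len(a) - len(b)
        a = trim([(s - c * b[i - k]) % p if i >= k else s
                  for i, s in enumerate(a)])
    return a

def smallest_p_poly(phi, p):
    """Return [c_0, ..., c_v] with sum c_j x^(p^j) = 0 (mod phi) and v minimal."""
    m = len(phi) - 1
    rows = []                    # (pivot index, reduced vector, combination)
    r = poly_mod([0, 1], phi, p) # remainder of x^(p^0)
    i = 0
    while True:
        vec = r + [0] * (m - len(r))
        combo = [0] * i + [1]
        for piv, v, c in rows:
            t = vec[piv]
            if t:
                vec = [(s - t * u) % p for s, u in zip(vec, v)]
                c = c + [0] * (len(combo) - len(c))
                combo = [(s - t * u) % p for s, u in zip(combo, c)]
        if not any(vec):         # first linear relation found
            return combo
        piv = next(k for k, s in enumerate(vec) if s)
        inv = pow(vec[piv], -1, p)
        rows.append((piv, [s * inv % p for s in vec],
                     [s * inv % p for s in combo]))
        nxt = [1]
        for _ in range(p):       # raise r to the p-th power mod phi
            nxt = poly_mod(mul(nxt, r, p), phi, p)
        r, i = nxt, i + 1

print(smallest_p_poly([1, 1, 0, 1], 2))   # phi = x^3 + x + 1
print(smallest_p_poly([1, 0, 1, 1], 2))   # phi = x^3 + x^2 + 1
```

For p = 2 the first call yields the coefficients of x^4 + x^2 + x (exponent 2), while the second yields those of x^8 + x (exponent 3), in agreement with the bound of Theorem 8.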

Let next F(x) be an arbitrary p-polynomial without equal roots, and let
f(x) be the corresponding polynomial. Since each prime factor of f(x) divides
some $x^n - 1$, it follows that there exists a smallest exponent N such that
$x^N - 1$ is divisible by f(x). This gives, when applied to F(x),


THEOREM 9. There exists for each p-polynomial F(x) without equal roots a
smallest number N such that


(17) $x^{p^N} - x = G(x) \times F(x)$.


We shall call N the index of F(x); every irreducible ordinary factor of 
F(x) has then a degree dividing the index. 
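
The index can be computed directly from the corresponding polynomial, as the smallest N for which f(x) divides x^N - 1. The sketch below assumes the prime field F_p and a corresponding polynomial with nonvanishing constant term (the case of no equal roots); `index_of` and the other helper names are hypothetical.

```python
def trim(a):
    while a and a[-1] == 0:
        a = a[:-1]
    return a

def poly_mod(a, b, p):
    """Remainder of a on division by the nonzero polynomial b over F_p."""
    a = trim([c % p for c in a])
    inv = pow(b[-1], -1, p)                  # modular inverse, Python 3.8+
    while len(a) >= len(b):
        c, k = a[-1] * inv % p, len(a) - len(b)
        a = trim([(s - c * b[i - k]) % p if i >= k else s
                  for i, s in enumerate(a)])
    return a

def index_of(f, p):
    """Smallest N with f(x) dividing x^N - 1 over F_p (requires f(0) != 0)."""
    assert len(f) >= 2 and f[0] % p != 0
    r, N = poly_mod([0, 1], f, p), 1
    while r != [1]:
        r = poly_mod([0] + r, f, p)   # multiply by x and reduce
        N += 1
    return N

def p_poly(f, p):
    """p-polynomial corresponding to f (x^i becomes x^(p^i))."""
    F = [0] * (p ** (len(f) - 1) + 1)
    for i, c in enumerate(f):
        F[p ** i] = c % p
    return trim(F)

p = 2
f = [1, 1, 1]                 # f(x) = x^2 + x + 1
N = index_of(f, p)
top = [0] * (p ** N + 1)      # x^(p^N) - x
top[1], top[-1] = p - 1, 1
print(N, poly_mod(top, p_poly(f, p), p) == [])
```

Here the p-polynomial x^4 + x^2 + x has index 3 and divides x^8 - x, as relation (17) requires.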

Since every polynomial belongs to some p-polynomial, it follows, in
particular, that every prime polynomial $\phi(x)$ belongs to some F(x), and it is
easily seen that one can assume that F(x) has no equal roots. The degree $N'$
of $\phi(x)$ is then a divisor of the index N of F(x), according to (17). On the
other hand, $\phi(x)$ is a divisor of the p-polynomial


$F_{N'}(x) = x^{p^{N'}} - x$,


and F(x) is therefore also a symbolic divisor of the p-polynomial $F_{N'}(x)$.
This shows, conversely, that N is a divisor of $N'$, and we obtain


THEOREM 10. An irreducible polynomial of degree N belongs to a p-poly- 
nomial with the index N, and conversely, every irreducible ordinary factor be- 
longing to a p-polynomial with the index N has the degree N. 


At the close of these considerations I should like to make another
observation. When one wishes to find the prime function decomposition (mod p)
of an ordinary polynomial f(x), one usually determines the smallest exponent N
such that f(x) divides $x^{p^N} - x$.* In order to obtain this, one can construct
the system of congruences (16); instead of continuing the divisions until


$x^{p^N} \equiv x \pmod{f(x)}$,


it is usually simpler to eliminate the powers of x on the right-hand side and
find the p-polynomial $\Phi(x)$ which f(x) divides. When $\phi(x)$ corresponds to
$\Phi(x)$ it is only necessary to find the N for which $\phi(x)$ divides $x^N - 1$.

* See for instance A. Arwin, Über Kongruenzen von dem fünften und höheren Graden nach einem
Primzahlmodulus, Arkiv för Matematik, Astronomi och Fysik, vol. 14 (1918).

4. Primitive roots. The problem now naturally arises to find the number
of irreducible polynomials belonging to a given p-polynomial F(x). When
F(x) has the index N, these polynomials are all of degree N. One may
state the problem in a somewhat different form. We shall say that a root $\mu$
is a primitive root of F(x) = 0 when it satisfies no p-equation of lower degree.
Our problem is then equivalent to the determination of the primitive roots.
Now let


(18) $F(x) = \Phi_1(x)^{(e_1)} \times \cdots \times \Phi_r(x)^{(e_r)}$


be the symbolic prime function decomposition of F(x), in which the
exponents signify the repetition of equal factors; the exponent of $\Phi_i(x)$ is $m_i$.
The primitive roots of F(x) are obtained when one omits all the roots of the
polynomials $F(x) \times \Phi_i(x)^{-1}$ and a common argument in number theory shows
that


(19) $p^n \left(1 - \frac{1}{p^{m_1}}\right) \cdots \left(1 - \frac{1}{p^{m_r}}\right)$


represents the number of primitive roots.
The expression (19) can also be interpreted in a different way. Let f(x)
be the polynomial corresponding to F(x); then according to (18)


(20) $f(x) = \phi_1(x)^{e_1} \cdots \phi_r(x)^{e_r}$


is the prime polynomial decomposition of f(x). Now let $\Phi(f(x))$ denote the
number of residues (modd p, f(x)) which are relatively prime to f(x); one
finds then for this generalized $\phi$-function exactly the expression (19). This
gives

THEOREM 11. When the p-polynomial F(x) with the corresponding
polynomial f(x) has the symbolic prime function decomposition (18), then F(x)
has exactly


(21) $\Phi(f(x)) = p^n \left(1 - \frac{1}{p^{m_1}}\right) \cdots \left(1 - \frac{1}{p^{m_r}}\right)$

primitive roots; here the $m_i$ denote the exponents of the different prime factors
of F(x).
This theorem permits a series of applications. It shows the following, 
first of all: 


There exist primitive roots for all p-polynomials. 


Furthermore: 
The number of irreducible polynomials belonging to F(x) is $(1/N)\Phi(f(x))$,
where N is the index of F(x). 
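
The generalized function $\Phi(f(x))$ is easy to tabulate for small cases, both from its definition as a count of residues and from the product formula of Theorem 11. The following sketch assumes the prime field F_p and takes the degrees of the distinct prime factors of f(x) as given input; `phi_count` and `phi_formula` are hypothetical names.

```python
from itertools import product

def trim(a):
    while a and a[-1] == 0:
        a = a[:-1]
    return a

def poly_mod(a, b, p):
    """Remainder of a on division by the nonzero polynomial b over F_p."""
    a = trim([c % p for c in a])
    inv = pow(b[-1], -1, p)                  # modular inverse, Python 3.8+
    while len(a) >= len(b):
        c, k = a[-1] * inv % p, len(a) - len(b)
        a = trim([(s - c * b[i - k]) % p if i >= k else s
                  for i, s in enumerate(a)])
    return a

def gcd(a, b, p):
    """Monic greatest common factor; at least one argument must be nonzero."""
    a, b = trim([c % p for c in a]), trim([c % p for c in b])
    while b:
        a, b = b, poly_mod(a, b, p)
    inv = pow(a[-1], -1, p)
    return [c * inv % p for c in a]

def phi_count(f, p):
    """Number of residues (modd p, f) relatively prime to f, by enumeration."""
    n = len(f) - 1
    return sum(1 for r in product(range(p), repeat=n)
               if gcd(list(r), f, p) == [1])

def phi_formula(p, n, degrees):
    """p^n * prod(1 - p^(-m)) over the degrees m of the distinct prime factors."""
    value = p ** n
    for m in degrees:
        value = value * (p ** m - 1) // p ** m
    return value

# x^3 - 1 = (x + 1)(x^2 + x + 1) over F_2; x^2 - 1 = (x - 1)(x + 1) over F_3
print(phi_count([1, 0, 0, 1], 2), phi_formula(2, 3, [1, 2]))
print(phi_count([2, 0, 1], 3), phi_formula(3, 2, [1, 1]))
```

Dividing the first value by the index N = 3 of x^8 - x gives a single irreducible polynomial belonging to x^8 - x, as the statement above predicts.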


Since there always exist prime functions of degree N dividing F(x), it
follows that every p-polynomial has the following property in common with
$x^{p^n} - x$:

The degrees of the ordinary irreducible factors of a p-polynomial always
divide the degree of the prime divisor of highest degree.


Since every p-modulus $M_p$ forms the set of roots of a p-polynomial F(x),
and since F(x) has primitive roots, it follows that $M_p$ can be generated in the
form (15) by a primitive root of F(x). This gives the proof of Theorem 7:

Every p-modulus is simple.

An important special case is the case where the p-modulus is a finite field
with $p^n$ elements; the corresponding p-polynomial is then $x^{p^n} - x$. Theorem
11 shows that there exist $\Phi(x^n - 1)$ numbers $\mu$ such that every element can
be represented in the form


$\omega = a_0 \mu + a_1 \mu^{p} + \cdots + a_{n-1} \mu^{p^{n-1}}$.

We have therefore proved


THEOREM 12. In a finite field of degree n there exist $(1/n)\Phi(x^n - 1)$ different
bases consisting of conjugate elements.
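
Theorem 12 can be confirmed by direct enumeration in small fields of characteristic 2. The sketch below is an illustration only: it assumes the fields of 8 and 16 elements are realized as F_2[t]/(t^3 + t + 1) and F_2[t]/(t^4 + t + 1), stores field elements as bit masks, and counts the elements whose conjugates are linearly independent over F_2. All function names are hypothetical.

```python
def gf_mul(a, b, mod, n):
    """Multiply two elements of GF(2^n) = F_2[t]/(mod), given as bit masks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> n) & 1:      # reduce when the degree reaches n
            a ^= mod
    return r

def rank_gf2(vectors):
    """Rank over F_2 of bit-mask vectors, by elimination on the leading bit."""
    pivots = {}
    for v in vectors:
        while v:
            h = v.bit_length() - 1
            if h not in pivots:
                pivots[h] = v
                break
            v ^= pivots[h]
    return len(pivots)

def count_normal(mod, n):
    """Number of elements whose n conjugates form a basis over F_2."""
    count = 0
    for theta in range(1, 1 << n):
        conj, t = [], theta
        for _ in range(n):
            conj.append(t)
            t = gf_mul(t, t, mod, n)   # next conjugate: t -> t^2
        if rank_gf2(conj) == n:
            count += 1
    return count

print(count_normal(0b1011, 3), count_normal(0b10011, 4))
```

The two counts agree with $\Phi(x^3 - 1) = 3$ and $\Phi(x^4 - 1) = 8$ for p = 2, so that there are 3/3 = 1 and 8/4 = 2 bases of conjugate elements respectively.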




Theorem 12 gives the answer to a problem proposed already by
Eisenstein*, and partly solved by Schönemann.† The first complete solution was
given by Hensel‡; it should also be observed that the existence of such a basis
is a consequence of a much more general theorem by Noether and Deuring§,
proving the existence of a basis consisting of conjugate elements for an
arbitrary Galois field.

5. Symbolic multiplication. Let F(x) be a p-polynomial with the
exponent m, f(x) the corresponding polynomial and $M_p$ the p-modulus of the
roots. All elements of $M_p$ are then of the form


(22) $Q(\mu) = a_0 \mu + \cdots + a_{m-1} \mu^{p^{m-1}}$,


where $\mu$ is a primitive root. The number $Q(\mu)$ belongs to some divisor $F_1(x)$
of F(x) and this divisor can easily be found. If namely $F(x) = F_1(x) \times F_2(x)$
and $F_1(Q(\mu)) = 0$, then one must have $F_1(x) \times Q(x) \equiv 0 \pmod{F(x)}$ or
$Q(x) \equiv 0 \pmod{F_2(x)}$, and one finds

THEOREM 13. When $F_1(x) \times F_2(x) = F(x)$, then an element (22) in $M_p$
belongs to $F_1(x)$ if and only if

$Q(x) = Q_1(x) \times F_2(x)$,


where $Q_1(x)$ is relatively prime to $F_1(x)$.


The primitive elements of $M_p$ consequently consist of those $Q(\mu)$ for which
Q(x) is relatively prime to F(x).

* G. Eisenstein, Über irreduzible Kongruenzen, Journal für Mathematik, vol. 39 (1850), p. 182.

† Schönemann, Über einige von Herrn Dr. Eisenstein aufgestellte Lehrsätze etc., Journal für
Mathematik, vol. 40 (1850), pp. 185-187.

‡ K. Hensel, Über die Darstellung der Zahlen eines Gattungsbereiches für einen beliebigen
Primdivisor, Journal für Mathematik, vol. 103 (1888), pp. 230-237.

§ M. Deuring, Galoissche Theorie und Darstellungstheorie, Mathematische Annalen, vol. 107
(1932), pp. 140-144.

The existence of a primitive element also permits us to introduce a
symbolic multiplication in a p-modulus and make the p-modulus a ring; and this
can even be done in several ways. Let $\mu$ as formerly be a primitive element;
to define the product of two elements


$\alpha = A(\mu)$,  $\beta = B(\mu)$,


we put
(23) $\alpha \times \beta = \beta \times \alpha = [A(x) \times B(x)]_{x = \mu}$.
This product is associative, distributive and commutative; it should be
observed that the definition (23) depends essentially upon the choice of the
primitive element $\mu$, because $\mu$ must be the unit element of the symbolic
multiplication. The ring $M_p$ defined by a particular $\mu$ is seen to be isomorphic to
the ring of all residue-classes (mod f(x)), where f(x) is the polynomial
corresponding to F(x); $M_p$ is a field only when F(x) is symbolically irreducible.
When applied to a finite field, one obtains in particular

THEOREM 14. Let $\mu$ be a primitive element in a finite field $K_n$, such that the
conjugates of $\mu$ form a basis. Each element in $K_n$ is then a p-polynomial in $\mu$
and the symbolic multiplication of these p-polynomials introduces a new
definition of multiplication in $K_n$. With regard to this multiplication $K_n$ is a ring
isomorphic to the ring of residue-classes for the double modulus
(modd p, $x^n - 1$).


Now let F(x) be a p-polynomial with the symbolic prime polynomial
decomposition


(24) $F(x) = \Phi_1(x)^{(e_1)} \times \cdots \times \Phi_r(x)^{(e_r)}$
and let us put
$A_i(x) = F(x) \times \Phi_i(x)^{(-e_i)}$  ($i = 1, 2, \ldots, r$).


The primitive roots of $\Phi_i(x)^{(e_i)} = 0$ are then $Q_i(\mu) \times A_i(\mu)$, where $Q_i(x)$ is
not divisible by $\Phi_i(x)$, and where $\mu$ as before denotes a primitive root of
F(x) = 0. Every root $\omega$ of F(x) is representable uniquely in the form


$\omega = \sum_{i=1}^{r} R_i(\mu) \times A_i(\mu)$,


where the degree of $R_i(x)$ is smaller than the degree of $\Phi_i(x)^{(e_i)}$. This shows
that each root is uniquely representable in the form


$\omega = \mu_1 + \cdots + \mu_r$,
where $\mu_i$ is a root of
(25) $\Phi_i(x)^{(e_i)} = 0$.


The root $\omega$ is primitive if and only if all $\mu_i$ are primitive roots of their
corresponding equations (25).
Now let


(26) $G(x) = \Phi_1(x)^{(f_1)} \times \cdots \times \Phi_r(x)^{(f_r)}$


be a second p-polynomial and

$\nu = \nu_1 + \cdots + \nu_r$

the representation of one of its primitive roots. The number


$\mu + \nu = (\mu_1 + \nu_1) + \cdots + (\mu_r + \nu_r)$


is then a root of the union [F(x), G(x)]. When for an index i we have $e_i > f_i$,
then the element $\lambda_i = \mu_i + \nu_i$ is a primitive root of $\Phi_i(x)^{(e_i)} = 0$ as one easily
sees, and correspondingly for $f_i > e_i$. When $e_i = f_i$ it may happen, however,
that $\lambda_i$ is not a primitive root, but when $p \ne 2$ it is always possible even to a
fixed $\mu_i$ to choose a $\nu_i$ such that $\lambda_i$ is a primitive root, for instance $\nu_i = \mu_i$.
When p = 2 and $\Phi_i(x)$ has the exponent 1, one finds that no primitive root
$\nu_i$ with the property indicated exists.


THEOREM 15. Let F(x) and G(x) be two p-polynomials with the symbolic prime
polynomial decompositions (24) and (26), and let $\mu$ and $\nu$ be two primitive roots.
When for all i $e_i \ne f_i$, then $\lambda = \mu + \nu$ is a primitive root of the least common
multiple [F(x), G(x)]. If $e_i = f_i$ for some i and $p \ne 2$, one can always to every
primitive $\mu$ find a primitive $\nu$ such that $\lambda$ is a primitive root of the union.


6. $p^f$-polynomials. We shall finally make a slight generalization of the
former theory by considering $p^f$-polynomials


$F(x) = a_0 x^{p^{fn}} + a_1 x^{p^{f(n-1)}} + \cdots + a_n x$,


where the coefficients $a_i$ are elements of a finite field $K_f$ of degree f. The
polynomial corresponding to F(x) is


$f(x) = a_0 x^n + a_1 x^{n-1} + \cdots + a_n$.


One can define the symbolic multiplication by substitution as in §1, and one
finds that the symbolic multiplication is commutative and that the
polynomial corresponding to a symbolic product is equal to the product of the
corresponding factors; Theorems 2 and 3 also hold without change.

The decomposition of

$x^{p^{fn}} - x$

into $p^f$-factors corresponds uniquely to the decomposition of $x^n - 1$ into
irreducible factors in $K_f$.

The roots of a $p^f$-polynomial form a $p^f$-modulus, i.e., a modulus with the
following properties:


1. When $\mu$ belongs to $M_{p^f}$, then $a\mu$ also belongs for all elements a of $K_f$.

2. When $\mu$ belongs to $M_{p^f}$, then $\mu^{p^f}$ also belongs.

Every $p^f$-modulus forms the set of roots of a $p^f$-polynomial.

One finds that every polynomial with coefficients in $K_f$ belongs to a
$p^f$-polynomial. The smallest exponent N such that F(x) divides


$x^{p^{fN}} - x$


is called the index of F(x), and Theorem 10 will hold unchanged. One can
then prove the existence of primitive roots for a $p^f$-polynomial and obtain




similar formulas for their number. It follows that every $p^f$-modulus is simple
and can be represented in the form


$M_{p^f} = c_0 \mu + c_1 \mu^{p^f} + \cdots$


When applied to a finite field of degree $ff'$ this gives


THEOREM 16. In a finite field $K_{ff'}$ of degree $ff'$ one can always find bases
with respect to $K_f$ consisting of conjugate elements.


The analogue of Theorem 15 can easily be deduced.


CHAPTER II. DECOMPOSITION THEOREMS 


1. Identities for $x^{p^n} - x$. Let F(x) and G(x) be two p-polynomials, and
let $\alpha$ be an arbitrary root of F(x) = 0 and $\beta$ an arbitrary root of G(x) = 0.
From the definition of the symbolic multiplication it follows that the following
identities must hold:


(1) $F(x) \times G(x) = \prod_{\beta} (F(x) - \beta) = \prod_{\alpha} (G(x) - \alpha)$.


This simple remark gives, when applied to $x^{p^n} - x$,


THEOREM 1. Let f(x) and g(x) be two complementary divisors of $x^n - 1$ such
that


(2) $x^n - 1 = f(x) g(x)$,
and let F(x) and G(x) be the corresponding p-polynomials. Then
(3) $x^{p^n} - x = \prod_{\beta} (F(x) - \beta) = \prod_{\alpha} (G(x) - \alpha)$,


where $\alpha$ runs through all the roots of F(x) = 0 and $\beta$ through all the roots of
G(x) = 0.


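Using $p^f$-polynomials one obtains a similar theorem for the decomposition
of $x^{p^{fn}} - x$. Since $x^n - 1$ always has the two factors


$f(x) = x^{n-1} + \cdots + x + 1$,  $g(x) = x - 1$,


one obtains as a special case of the decomposition (3) the decompositions
given by Mathieu*


(4) $x^{p^n} - x = \prod_{a=0}^{p-1} (x^{p^{n-1}} + \cdots + x^{p} + x - a) = \prod_{\beta} (x^{p} - x - \beta)$,

where $\beta$ runs through all solutions of
(5) $\beta^{p^{n-1}} + \cdots + \beta^{p} + \beta = 0$.
When $p^f$-polynomials are applied one obtains
(6) $x^{p^{fn}} - x = \prod_{a} (x^{p^{f(n-1)}} + \cdots + x^{p^f} + x - a) = \prod_{\beta} (x^{p^f} - x - \beta)$,
where a runs through all elements of $K_f$, while $\beta$ runs through the roots of
(7) $\beta^{p^{f(n-1)}} + \cdots + \beta^{p^f} + \beta = 0$.

* E. Mathieu, Mémoire sur l'étude de fonctions de plusieurs quantités etc., Journal de
Mathématiques, (2), vol. 6 (1861), pp. 241-323.

The significance of the conditions (5) and (7) is seen to be the following:
When $\beta$ is a root of (7) it is an element of the finite field $K_{fn}$ of relative degree
n with respect to $K_f$, and it therefore satisfies an irreducible equation in $K_f$
of degree $n_\beta$, where $n_\beta$ divides n. When $a_1^{(\beta)}$ denotes the coefficient of
$x^{n_\beta - 1}$ in this equation, one finds


$\beta^{p^{f(n-1)}} + \cdots + \beta^{p^f} + \beta = -\frac{n}{n_\beta} a_1^{(\beta)}$
and the condition (7) is equivalent to
(8) $\frac{n}{n_\beta} a_1^{(\beta)} \equiv 0 \pmod{p}$,


or simply $a_1^{(\beta)} \equiv 0 \pmod{p}$ when n is not divisible by p.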
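
The first product in Mathieu's decomposition involves only rational coefficients, so it can be verified by plain polynomial multiplication (mod p). The sketch below does this for a few small p and n; the function names are hypothetical and polynomials are again coefficient lists with the lowest degree first.

```python
def trim(a):
    while a and a[-1] == 0:
        a = a[:-1]
    return a

def mul(a, b, p):
    if not a or not b:
        return []
    c = [0] * (len(a) + len(b) - 1)
    for i, s in enumerate(a):
        for j, t in enumerate(b):
            c[i + j] = (c[i + j] + s * t) % p
    return trim(c)

def mathieu_product(p, n):
    """Product over a = 0, ..., p-1 of (x^(p^(n-1)) + ... + x^p + x - a) mod p."""
    trace = [0] * (p ** (n - 1) + 1)
    for i in range(n):
        trace[p ** i] = 1
    prod = [1]
    for a in range(p):
        factor = trace[:]
        factor[0] = (-a) % p
        prod = mul(prod, factor, p)
    return prod

def x_pn_minus_x(p, n):
    """Coefficient list of x^(p^n) - x over F_p."""
    t = [0] * (p ** n + 1)
    t[1], t[-1] = p - 1, 1
    return t

for p, n in [(2, 3), (3, 2), (5, 2)]:
    print(p, n, mathieu_product(p, n) == x_pn_minus_x(p, n))
```

The second product in (4) runs over roots $\beta$ lying in an extension field and is therefore not covered by this check.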

2. Decomposition theorems. The object of the following considerations
is to give a method to determine the prime polynomials belonging to a
product $F(x) = F_1(x) \times F_2(x)$ of two $p^f$-polynomials, when the prime factors of
$F_1(x)$ and $F_2(x)$ are known. According to (1) we have the decomposition


(9) $F(x) = \prod_{\alpha_1} (F_2(x) - \alpha_1) = \prod_{\alpha_2} (F_1(x) - \alpha_2)$,

where $\alpha_1$ and $\alpha_2$ run through the roots of $F_1(x) = 0$ and $F_2(x) = 0$ respectively.
Each root of F(x) then satisfies an equation

(10) $F_2(x) = \alpha_1$.

We shall determine all equations (10) satisfied by primitive roots of F(x);
first, it is obvious that a primitive root can only satisfy (10) when $\alpha_1$ is a
primitive root of $F_1(x)$. Next let $\mu$ be a primitive root of F(x); then according
to Theorem 16, $\alpha_1$ must have the form $\alpha_1 = Q(\mu) \times F_2(\mu)$, where Q(x) is
relatively prime to $F_1(x)$; when $R(\mu)$ denotes an arbitrary root of (10), then one
obtains

$F_2(x) \times (R(x) - Q(x)) \equiv 0 \pmod{F(x)}$
or
(11) $R(x) = Q(x) - K(x) \times F_1(x)$.


The relation (11) gives the general form of a root of (10), including also the
case where $\alpha_1$ is not a primitive root of $F_1(x)$.
Let us next write


(12) $F_1(x) = G_1(x) \times D_1(x)$,  $F_2(x) = G_2(x) \times D_2(x)$,


where $G_1(x)$ and $G_2(x)$ are relatively prime and $D_1(x)$ and $D_2(x)$ contain only
prime factors which are common to $F_1(x)$ and $F_2(x)$. When Q(x) is relatively
prime to $F_1(x)$ it follows from (11) that any common factor of R(x) and F(x)
must be a divisor $G_2'(x)$ of $G_2(x)$, and this shows that every root of (10) belongs
to a polynomial


(13) $G_2''(x) \times D_2(x) \times F_1(x)$,


where $G_2(x) = G_2'(x) \times G_2''(x)$.

In order to determine the exact number of roots of (10) belonging to a
given polynomial (13), we observe that R(x) must be of the form
$R_1(x) \times G_2'(x)$, where $R_1(x)$ is relatively prime to (13); comparing this with (11)
one finds


(14) $R_1(x) \times G_2'(x) + K(x) \times F_1(x) = Q(x)$


and our problem is equivalent to the determination of the number of solutions
$R_1(x)$ of degree less than the degree of (13) and relatively prime to this
polynomial, i.e., relatively prime to $G_2''(x)$ since no solution of (14) can have a
factor in common with $F_1(x)$. Since $G_2'(x)$ is relatively prime to $F_1(x)$, it follows
that (14) has a special solution $R_1^{(0)}(x)$ such that the general solution is


(15) $R_1(x) = R_1^{(0)}(x) + M(x) \times F_1(x)$,


where M(x) is an arbitrary polynomial whose degree is smaller than the
degree of $G_2''(x) \times D_2(x)$. The total number of polynomials M(x) is then $p^{f^*}$,
where $f^* = f(g_2'' + d_2)$ and where $g_2''$ and $d_2$ are the exponents of $G_2''(x)$ and
$D_2(x)$. One finds by the usual argument in number theory that the number of
solutions of (15) which are relatively prime to $G_2''(x)$ will be


(16) $N = p^{f d_2} \Phi(g_2''(x))$,
where $g_2''(x)$ is the polynomial corresponding to $G_2''(x)$ and $\Phi$ is the generalized
Euler function introduced in §4, chapter 1. A well known property of the
$\Phi$-function shows that the sum of all numbers (16) taken over all divisors
$g_2''(x)$ of $g_2(x)$ is equal to the degree of $F_2(x)$ as one should expect.

THEOREM 2. Let $F(x) = F_1(x) \times F_2(x)$ be the product of two $p^f$-polynomials
and


$F(x) = \prod_{\alpha_1} (F_2(x) - \alpha_1)$


the corresponding decomposition, where $\alpha_1$ runs through all roots of $F_1(x)$. The
primitive roots of F(x) are roots of the equations


(17) $F_2(x) = \alpha_1$,
where $\alpha_1$ runs through the primitive roots of $F_1(x)$. When
$F_1(x) = G_1(x) \times D_1(x)$,  $F_2(x) = G_2(x) \times D_2(x)$,
where $D_1(x)$ and $D_2(x)$ contain the prime factors which are common to $F_1(x)$ and
$F_2(x)$, then every root of (17) belongs to a polynomial
(18) $D(x) \times D_2(x) \times F_1(x)$,
where D(x) is a divisor of $G_2(x)$. The exact number of roots belonging to a given
polynomial (18) is
(19) $N(D) = p^{f d_2} \Phi(d(x))$,
where $d_2$ is the exponent of $D_2(x)$ and d(x) the polynomial corresponding to D(x).


This theorem shows, in particular, that the number of roots of the various
categories of an equation (17) is the same for all primitive $\alpha_1$ and the number
of primitive roots is $p^{f d_2} \Phi(g_2(x))$, where $g_2(x)$ is the polynomial corresponding
to $G_2(x)$.

Instead of considering (17) one could have determined the primitive roots
of F(x) as a root of an equation


(20) $F_1(x) = \alpha_2$.


The common roots of two equations (20) and (17) can be obtained in the
following manner: one can write $\alpha_2$ in the symbolic form $\alpha_2 = Q_1(\mu) \times F_1(\mu)$ and
one finds as in (11) that the general root of (20) has the form


(21) $R_1(x) = Q_1(x) - L(x) \times F_2(x)$.


The comparison of (21) with (11) shows that in case of a common root the
polynomials K(x) and L(x) must satisfy the condition


(22) $Q(x) - Q_1(x) = K(x) \times F_1(x) - L(x) \times F_2(x)$.
This equation is solvable if and only if
(23) $Q(x) \equiv Q_1(x) \pmod{T(x)}$,


where T(x) is the greatest common factor of $F_1(x)$ and $F_2(x)$; when the
condition (23) is satisfied, one obtains exactly $p^{ft}$ common solutions from (22),
where t denotes the exponent of T(x). A special case of particular importance
is the following:


THEOREM 3. Let $F_1(x)$ and $F_2(x)$ be two $p^f$-polynomials without common
factor; the equations


(24) $F_1(x) = \alpha_2$,  $F_2(x) = \alpha_1$,


where $\alpha_1$ is a root of $F_1(x)$ and $\alpha_2$ is a root of $F_2(x)$, have then exactly one root
in common.


The common root can be found from (22); when $\alpha_1$ is a primitive root of
$F_1(x)$ and $\alpha_2$ is a primitive root of $F_2(x)$, then the common root $\rho$ in (24) is a
primitive root of $F(x) = F_1(x) \times F_2(x)$, and this remark gives a simple method
for determining all primitive roots of F(x).

3. Applications. The theorems derived in §2 have a number of
applications. Let us use the former notation and let $\phi_1(x)$ be an irreducible
polynomial in $K_f$ belonging to the $p^f$-polynomial $F_1(x)$. When $\alpha_1$ is an arbitrary
root of $\phi_1(x)$, then


(25) $\phi_1(F_2(x)) = (F_2(x) - \alpha_1)(F_2(x) - \alpha_1^{p^f}) \cdots (F_2(x) - \alpha_1^{p^{f(N_1 - 1)}})$,


where $N_1$ is the degree of $\phi_1(x)$. We now join all factors in (9) in the form (25)
and Theorem 2 gives the following result:


THEOREM 4. Let $\phi_1(x)$ be an irreducible polynomial of degree $N_1$ belonging
to the $p^f$-polynomial $F_1(x)$; let $F_2(x)$ be a second $p^f$-polynomial and


$F_1(x) = G_1(x) \times D_1(x)$,  $F_2(x) = G_2(x) \times D_2(x)$,


where $D_1(x)$ and $D_2(x)$ contain the prime factors common to $F_1(x)$ and $F_2(x)$.
The polynomial $\phi_1(F_2(x))$ is then equal to a product of prime polynomials
belonging to $p^f$-polynomials


(26) $D(x) \times D_2(x) \times F_1(x)$,


where D(x) is a divisor of $G_2(x)$. The number of prime polynomials belonging
to a given polynomial (26) is

$\frac{N_1}{N} p^{f d_2} \Phi(d(x))$,

where N is the index of (26), $d_2$ the exponent of $D_2(x)$ and d(x) the polynomial
corresponding to D(x).

There are several cases of Theorem 4 which are of special interest. Since
all prime factors of $\phi_1(F_2(x))$ belong to a multiple of $F_1(x)$, it is clear that the
degrees of all prime polynomials are divisible by $N_1$. In the case where $F_1(x)$
is relatively prime to $F_2(x)$, all prime factors of $\phi_1(F_2(x))$ belong to some
$D(x) \times F_1(x)$, where D(x) is a divisor of $F_2(x)$ and the number of prime factors
belonging to such a given polynomial is simply


$\frac{N_1}{N} \Phi(d(x))$,


where N is the index of $D(x) \times F_1(x)$. There will be exactly


$\frac{N_1}{[N_1, N_2]} \Phi(f_2(x))$

irreducible factors belonging to $F_1(x) \times F_2(x)$, where $N_2$ is the index of $F_2(x)$,
while there will be only one prime polynomial belonging to $F_1(x)$ and dividing
$\phi_1(F_2(x))$. The roots of this prime polynomial can easily be obtained from
(11).

Theorem 3 gives a surprisingly simple method for determining the prime
polynomials belonging to a product of $p^f$-polynomials when those of the
factors are known:


THEOREM 5. Let $F_1(x)$ be relatively prime to $F_2(x)$ and let $\phi_1(x)$ be a prime
polynomial belonging to $F_1(x)$ while $\phi_2(x)$ belongs to $F_2(x)$. The greatest common
factor of the two polynomials
(27) $\phi_1(F_2(x))$,  $\phi_2(F_1(x))$
is then a prime polynomial belonging to $F_1(x) \times F_2(x)$ and all prime polynomials
belonging to the product can be determined in this way.
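
A small instance of Theorem 5 can be worked out by machine. The sketch below is an illustration under stated assumptions: p = 2 and f = 1, with $F_1(x) = x^2 + x$ and $F_2(x) = x^4 + x^2 + x$ (corresponding to x + 1 and x^2 + x + 1), $\phi_1(x) = x + 1$ and $\phi_2(x) = x^3 + x + 1$. These choices and all helper names are ours, not the paper's.

```python
def trim(a):
    while a and a[-1] == 0:
        a = a[:-1]
    return a

def add(a, b, p):
    n = max(len(a), len(b))
    a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
    return trim([(s + t) % p for s, t in zip(a, b)])

def mul(a, b, p):
    if not a or not b:
        return []
    c = [0] * (len(a) + len(b) - 1)
    for i, s in enumerate(a):
        for j, t in enumerate(b):
            c[i + j] = (c[i + j] + s * t) % p
    return trim(c)

def compose(g, f, p):
    """Return g(f(x)) by Horner's rule."""
    r = []
    for c in reversed(g):
        r = add(mul(r, f, p), [c % p], p)
    return r

def poly_mod(a, b, p):
    """Remainder of a on division by the nonzero polynomial b over F_p."""
    a = trim([c % p for c in a])
    inv = pow(b[-1], -1, p)                  # modular inverse, Python 3.8+
    while len(a) >= len(b):
        c, k = a[-1] * inv % p, len(a) - len(b)
        a = trim([(s - c * b[i - k]) % p if i >= k else s
                  for i, s in enumerate(a)])
    return a

def gcd(a, b, p):
    a, b = trim(a), trim(b)
    while b:
        a, b = b, poly_mod(a, b, p)
    inv = pow(a[-1], -1, p)
    return [c * inv % p for c in a]

p = 2
F1, phi1 = [0, 1, 1], [1, 1]               # x^2 + x and x + 1
F2, phi2 = [0, 1, 1, 0, 1], [1, 1, 0, 1]   # x^4 + x^2 + x and x^3 + x + 1
common = gcd(compose(phi1, F2, p), compose(phi2, F1, p), p)
print(common)
```

The common factor found is x^3 + x^2 + 1, the single prime polynomial belonging to $F_1(x) \times F_2(x) = x^8 - x$; its roots are the three elements of the field of 8 elements whose conjugates form a basis.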


CHAPTER III. CONSTRUCTION OF IRREDUCIBLE POLYNOMIALS 


1. A class of irreducible polynomials. One of the most interesting but also
most difficult problems in the theory of higher congruences is the
determination of irreducible polynomials of a given degree in explicit form. At the
present time this problem has only been solved for very special cases, but it is
of interest to observe that almost all of the results obtained are closely related
to the theory of $p^f$-polynomials.

Before we illustrate this fact, we shall however use some of the former
results to obtain a new class of irreducible polynomials. Let f(x) be an
ordinary irreducible polynomial of degree n with coefficients in a finite field
$K_f$. We shall suppose in addition that f(x) is a primitive polynomial, i.e.
$p^{fn} - 1$ is the smallest exponent such that


$x^{p^{fn} - 1} \equiv 1 \pmod{f(x)}$.
For the $p^f$-polynomial F(x) corresponding to f(x) one then has symbolically

$x^{p^{f(p^{fn} - 1)}} \equiv x \pmod{F(x)}$

and the index of F(x) is $p^{fn} - 1$. Theorem 10 then shows that any ordinary
prime polynomial $\ne x$ dividing F(x) has the degree $p^{fn} - 1$. This gives


THEOREM 1. When
$f(x) = x^n + a_1 x^{n-1} + \cdots + a_n$


is an irreducible primitive polynomial in $K_f$, then

$\phi(x) = x^{p^{fn} - 1} + a_1 x^{p^{f(n-1)} - 1} + \cdots + a_n$

is an irreducible polynomial in the same field.


A consequence of Theorem 1 is obviously that the polynomial
$\phi_1(x) = x^{(p^{fn} - 1)/(p^f - 1)} + a_1 x^{(p^{f(n-1)} - 1)/(p^f - 1)} + \cdots + a_{n-1} x + a_n$


is irreducible.
As an illustration of Theorem 1 we may take f(x) = x - a and we obtain
the well known result that


$\phi(x) = x^{p^f - 1} - a$
is irreducible when a belongs to the exponent $p^f - 1$ and hence
$\phi(x) = x^{\delta} - a$


is also irreducible when $\delta$ is any divisor of $p^f - 1$.
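
The class of irreducible polynomials just described is easy to test for f = 1. The sketch below forms F(x)/x from a given polynomial f(x) over the prime field and tests irreducibility by trial division; this brute-force test is chosen only for brevity and is practical for small degrees. The names `ore_poly` and `is_irreducible` are hypothetical.

```python
from itertools import product

def trim(a):
    while a and a[-1] == 0:
        a = a[:-1]
    return a

def poly_mod(a, b, p):
    """Remainder of a on division by the nonzero polynomial b over F_p."""
    a = trim([c % p for c in a])
    inv = pow(b[-1], -1, p)                  # modular inverse, Python 3.8+
    while len(a) >= len(b):
        c, k = a[-1] * inv % p, len(a) - len(b)
        a = trim([(s - c * b[i - k]) % p if i >= k else s
                  for i, s in enumerate(a)])
    return a

def is_irreducible(f, p):
    """Trial division by every monic polynomial of degree at most deg(f)/2."""
    n = len(f) - 1
    for d in range(1, n // 2 + 1):
        for tail in product(range(p), repeat=d):
            if poly_mod(f, list(tail) + [1], p) == []:
                return False
    return n >= 1

def ore_poly(f, p):
    """F(x)/x, where F is the p-polynomial of f: x^i becomes x^(p^i - 1)."""
    g = [0] * (p ** (len(f) - 1))
    for i, c in enumerate(f):
        g[p ** i - 1] = c % p
    return g

# primitive f: x^2 + x + 1 and x^3 + x + 1 over F_2, x + 1 over F_3
print(is_irreducible(ore_poly([1, 1, 1], 2), 2),
      is_irreducible(ore_poly([1, 1, 0, 1], 2), 2),
      is_irreducible(ore_poly([1, 1], 3), 3))
# f = x^4 + x^3 + x^2 + x + 1 is irreducible over F_2 but not primitive
print(is_irreducible(ore_poly([1, 1, 1, 1, 1], 2), 2))
```

The three primitive examples give the irreducible polynomials x^3 + x + 1, x^7 + x + 1 and x^2 + 1, while the non-primitive example gives a reducible polynomial of degree 15, which shows that the primitivity hypothesis cannot simply be dropped.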

2. Substitution of a prime polynomial. Our next considerations are based
on Theorem 4, chapter 2, and this theorem shall be applied particularly for
the case where $F_2(x)$ is an irreducible p-polynomial. We use the former
notations, letting $N_1$ and $N_2$ be the indices of $F_1(x)$ and $F_2(x)$, while $f_1(x)$ and
$f_2(x)$ denote the polynomials corresponding to $F_1(x)$ and $F_2(x)$.

Let us first deal with the case where $F_2(x)$ symbolically divides $F_1(x)$ and
$F_1(x)$ contains $F_2(x)$ symbolically e times. Furthermore let $N_1 = p^A N_1'$, where
$N_1'$ is not divisible by p. The exponent A is obviously the smallest number
such that $p^A$ is not surpassed by any of the exponents occurring in the
symbolic prime function decomposition of $F_1(x)$. If now $e + 1 \le p^A$, then
$F_2(x) \times F_1(x)$ still has the index $N_1$, and when $\phi_1(x)$ is a polynomial belonging
to $F_1(x)$ and hence of degree $N_1$, then according to Theorem 4 $\phi_1(F_2(x))$
decomposes into irreducible factors of degree $N_1$. If however $e + 1 > p^A$, then e
is the largest exponent occurring in $F_1(x)$ and the index of $F_1(x) \times F_2(x)$ must
be $pN_1$, and hence $\phi_1(F_2(x))$ decomposes into factors of degree $pN_1$.

Next let $F_1(x)$ not be divisible by $F_2(x)$. The index of the product
$F_1(x) \times F_2(x)$ is


$[N_1, N_2] = \frac{N_1 N_2}{(N_1, N_2)}$


and each irreducible factor of $\phi_1(F_2(x))$ will, according to Theorem 4, belong
to $F_1(x) \times F_2(x)$ or to $F_1(x)$, and there will be one prime polynomial of degree
$N_1$ belonging to $F_1(x)$ and


$\frac{N_1}{[N_1, N_2]} \Phi(f_2(x)) = \frac{(N_1, N_2)}{N_2} (p^{f m_2} - 1)$


polynomials of degree $[N_1, N_2]$ belonging to $F_1(x) \times F_2(x)$, where $m_2$ denotes
the degree of $f_2(x)$.


THEOREM 2. Let $F_1(x)$ and $F_2(x)$ be two $p^f$-polynomials with the indices
$N_1$ and $N_2$; we shall suppose that $F_2(x)$ is symbolically irreducible and that
$\phi_1(x)$ is a prime polynomial belonging to $F_1(x)$. When $F_2(x)$ divides $F_1(x)$, then
$\phi_1(F_2(x))$ is the product of prime polynomials of degree $N_1$ except when $N_1$ is
divisible exactly by $p^A$ and $F_1(x)$ contains $F_2(x)$ to the same power $p^A$; then
$\phi_1(F_2(x))$ is the product of prime polynomials of degree $pN_1$.

When $F_2(x)$ does not divide $F_1(x)$, then $\phi_1(F_2(x))$ contains one prime factor
of degree $N_1$, while the remaining factors have the degree $[N_1, N_2]$.

3. Prime polynomials whose degrees are divisible by . We shall now 
apply the first part of Theorem 2 to obtain various irreducible polynomials 
whose degrees are divisible by p. We shall suppose for the moment that all 
polynomials have rational coefficients, and we put 


F,(x) = x? — ax, 


where the exponent d of a divides p—1 and is identical with the index of 
F,(x). Since we shall suppose that F;(x) is divisible by x” —ax, we must have 


| 

\ 

| 

: 

5, if 
4 


262 OYSTEIN ORE [April 


$x^{N_1} - 1$ divisible by $x - a$, which in turn shows that $N_1 \equiv 0 \pmod d$. To insure that the exceptional case of Theorem 2 occurs, we shall have to suppose furthermore that $F_1(x)$ divides $x^{p^{N_1}} - x$ but not the symbolic quotient of $x^{p^{N_1}} - x$ by $x^p - ax$,

$$x^{p^{N_1-1}} + a\,x^{p^{N_1-2}} + \cdots + a^{N_1-1}x.$$

This shows

THEOREM 3. Let $a$ be a rational integer belonging to the exponent $d$ and $\varphi(x)$ a prime polynomial of degree $N$ divisible by $d$. Then $\varphi(x^p - ax)$ is a prime polynomial of degree $pN$, when $\varphi(x)$ does not divide $x^{p^{N-1}} + a\,x^{p^{N-2}} + \cdots + a^{N-1}x$.

When $a = 1$, then $d = 1$ and the last condition of Theorem 3 is equivalent to $a_1 \neq 0$, where $a_1$ is the coefficient of $x^{N-1}$ in $\varphi(x)$. This gives the following well known result:

When $\varphi(x)$ is a prime polynomial of degree $N$ in which the coefficient of $x^{N-1}$ does not vanish, then $\varphi(x^p - x)$ is a prime polynomial of degree $pN$.

When applied to $\varphi(x) = x + a$, this shows that

$$x^p - x + a, \qquad a \neq 0,$$

is irreducible. I observe without proof that Theorem 3 can be modified to hold in an arbitrary field $K_f$.
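The well known result just stated lends itself to a small machine check over the prime field. The sketch below is only an illustration under stated assumptions: polynomials over $GF(p)$ are stored as coefficient lists with the lowest degree first, and irreducibility is tested by the standard criterion that a polynomial of degree $n$ is prime when it has no common factor with $x^{p^k} - x$ for $k \le n/2$. All helper names (`is_irreducible`, `poly_compose`, `x_p_minus_x`) are hypothetical and do not come from the paper.

```python
def trim(a):
    """Drop trailing zero coefficients (lists are low degree first)."""
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_add(a, b, p):
    n = max(len(a), len(b))
    return trim([((a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)) % p
                 for i in range(n)])

def poly_mul(a, b, p):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return trim(out)

def poly_mod(a, m, p):
    """Remainder of a modulo m over GF(p); m must be nonzero and trimmed."""
    a = trim(a[:])
    inv = pow(m[-1], -1, p)          # inverse of the leading coefficient
    while len(a) >= len(m):
        c = a[-1] * inv % p
        shift = len(a) - len(m)
        for i, mi in enumerate(m):
            a[shift + i] = (a[shift + i] - c * mi) % p
        trim(a)
    return a

def poly_gcd(a, b, p):
    a, b = trim(a[:]), trim(b[:])
    while b:
        a, b = b, poly_mod(a, b, p)
    return a

def poly_pow_mod(base, e, m, p):
    result, base = [1], poly_mod(base, m, p)
    while e:
        if e & 1:
            result = poly_mod(poly_mul(result, base, p), m, p)
        base = poly_mod(poly_mul(base, base, p), m, p)
        e >>= 1
    return result

def is_irreducible(f, p):
    """True when f has no factor of degree <= deg(f)/2 over GF(p)."""
    f = trim([c % p for c in f])
    n = len(f) - 1
    if n < 1:
        return False
    h = [0, 1]                                  # h = x
    for _ in range(n // 2):
        h = poly_pow_mod(h, p, f, p)            # h = x^(p^k) mod f
        if len(poly_gcd(f, poly_add(h, [0, p - 1], p), p)) > 1:
            return False
    return True

def poly_compose(phi, g, p):
    """phi(g(x)) over GF(p), by Horner's rule."""
    result = []
    for c in reversed(phi):
        result = poly_add(poly_mul(result, g, p), [c % p], p)
    return result

def x_p_minus_x(p, a=0):
    """Coefficients of x^p - x + a."""
    return [a % p, p - 1] + [0] * (p - 2) + [1]

# x^p - x + a for a != 0, and phi(x^2 - x) for phi = x^2 + x + 1 over GF(2).
print(is_irreducible(x_p_minus_x(5, 2), 5))
print(poly_compose([1, 1, 1], x_p_minus_x(2), 2))
```

For contrast, $\varphi(x) = x^2 + 1$ over $GF(3)$ has vanishing coefficient of $x$, and $\varphi(x^3 - x) = x^6 + x^4 + x^2 + 1 = (x^2 + 1)(x^4 + 1)$ is reducible, in agreement with the condition of the theorem.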

We shall also make an application of the first part of Theorem 2 to obtain in a simple way the results of Serret* and Dickson† on prime polynomials in a field $K_f$, whose degrees are powers of $p$. Let us denote by $\Pi_r(x)$ the product of $r$ equal symbolic factors $x^{p^f} - x$. For $r = p^{\nu}$ one obtains simply

(2) $\Pi_{p^{\nu}}(x) = x^{p^{f p^{\nu}}} - x.$

The polynomial corresponding to $\Pi_r(x)$ is $(x-1)^r$ and all symbolic divisors of $\Pi_{p^{\nu}}(x)$ are of the form $\Pi_r(x)$. Since a prime polynomial of degree $p^{\nu}$ must divide (2) every prime polynomial having this degree must belong to a unique polynomial

$$\Pi_r(x) \qquad (r = p^{\nu-1} + 1,\ p^{\nu-1} + 2,\ \dots,\ p^{\nu}).$$

* See Serret, Cours d’Algèbre.
† See Dickson, Linear Groups.

1934] THEORY OF FINITE FIELDS 263 


In this way one obtains a division of all prime polynomials of degree $p^{\nu}$ into $p^{\nu} - p^{\nu-1}$ classes. The class corresponding to $r = p^{\nu-1} + 1$ shall be called the first class and the class corresponding to $r = p^{\nu}$ the last class of degree $p^{\nu}$. Since

$$\Pi_r(x) = \Pi_{r-1}(x) \times (x^{p^f} - x) = \Pi_{r-1}(x)^{p^f} - \Pi_{r-1}(x)$$

and since all polynomials dividing, but not belonging to, $\Pi_r(x)$ must divide $\Pi_{r-1}(x)$, it follows that

$$\Gamma_r(x) = \prod_{a} \bigl(\Pi_{r-1}(x) - a\bigr),$$

where $a \neq 0$ runs through all of the elements of $K_f$, represents the product of all prime polynomials belonging to $\Pi_r(x)$. The first part of Theorem 2 gives immediately

THEOREM 4. When $\varphi(x)$ is a prime polynomial of degree $p^{\nu}$ belonging to the class $\rho$, then $\varphi(x^{p^f} - x)$ is the product of $p^f$ different prime polynomials of the class $\rho + 1$ except in the case where $\varphi(x)$ belongs to the last class of degree $p^{\nu}$, when $\varphi(x^{p^f} - x)$ is the product of $p^{f-1}$ prime polynomials of the first class of degree $p^{\nu+1}$.

4. Further applications. The second part of Theorem 2 may also be used to obtain results on irreducible polynomials. We saw that, with the same notation as before, $\varphi_1(F_2(x))$ contains one irreducible polynomial of degree $N_1$ belonging to $F_1(x)$ and

$$T = \frac{(N_1, N_2)}{N_2}\,\bigl(p^{f n_2} - 1\bigr)$$

irreducible polynomials of degree $[N_1, N_2]$ belonging to $F_1(x) \times F_2(x)$. We see that $T = 1$ only when the indices $N_1$ and $N_2$ are relatively prime and $N_2 = p^{f n_2} - 1$. We can then write $\varphi_1(F_2(x)) = \lambda(x) \cdot \mu(x)$ where $\lambda(x)$ has the degree $N_1 N_2$ and $\mu(x)$ divides $F_1(x)$. Hence we can write

$$\mu(x) = \bigl(\varphi_1(F_2(x)),\ F_1(x)\bigr)$$

and this gives the following: Let $f_2(x)$ be an irreducible primitive polynomial of degree $n_2$ and let $f_1(x)$ be an arbitrary polynomial belonging to the exponent $N_1$, where $(N_1, p^{f n_2} - 1) = 1$. When $F_1(x)$ and $F_2(x)$ are the corresponding $p^f$-polynomials and $\varphi_1(x)$ a prime polynomial belonging to $F_1(x)$, then

$$\frac{\varphi_1(F_2(x))}{\bigl(\varphi_1(F_2(x)),\ F_1(x)\bigr)}$$

is a prime polynomial of degree $N_1(p^{f n_2} - 1)$. It may be observed that this


result contains Theorem 1 for $F_1(x) = x$. We shall give a further application to the case where

$$F_1(x) = x^{p^f} - x.$$

One then has $N_1 = 1$ and $\varphi_1(x) = x - a$. Let us suppose

$$F_2(x) = x^{p^{fn}} + \beta_1 x^{p^{f(n-1)}} + \cdots + \beta_n x,$$

and hence

$$\varphi_1(F_2(x)) = x^{p^{fn}} + \beta_1 x^{p^{f(n-1)}} + \cdots + \beta_n x - a.$$

According to the general result this polynomial must contain a linear factor $x - \gamma$ and we find

$$\gamma = \frac{a}{F_2(1)}.$$

THEOREM 5. Let $f(x)$ be an irreducible primitive polynomial of degree $n$ and let $F(x)$ be the corresponding $p^f$-polynomial. Then

$$\frac{F(x) - a}{x - \dfrac{a}{F(1)}}$$

is an irreducible polynomial of degree $p^{fn} - 1$.

This theorem may be considered as a restatement of Theorem 1.
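A small worked instance may help fix the notation. It assumes that the polynomial in Theorem 5 is the quotient of $F(x) - a$ by its linear factor $x - a/F(1)$, with $a$ an element of $K_f$; the data $p = 2$, $f = 1$, $n = 2$, $a = 1$ are chosen here only for illustration.

```latex
% Assumed sample data: p = 2, f = 1, n = 2, a = 1.
\begin{align*}
f(x) &= x^2 + x + 1 \quad \text{(irreducible and primitive over } GF(2)\text{)},\\
F(x) &= x^{4} + x^{2} + x, \qquad F(1) = 1 + 1 + 1 = 1,\\
\frac{F(x) - 1}{x - 1} &= \frac{x^{4} + x^{2} + x + 1}{x + 1} = x^{3} + x^{2} + 1 .
\end{align*}
% x^3 + x^2 + 1 has no root in GF(2), so it is irreducible,
% and its degree is 3 = 2^2 - 1 = p^{fn} - 1.
```

The cubic obtained has no root in $GF(2)$ and is therefore prime, with the degree $p^{fn} - 1 = 3$ that the theorem predicts under this reading.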

5. Decompositions of $p^f$-polynomials. We shall now give the complete decomposition into prime factors of a few simple $p^f$-polynomials, thus also illustrating the general theorems.

1. In the simplest case

$$F(x) = x^{p^f} - ax,$$

let $\delta$ be the smallest exponent such that $a^{\delta} = 1$. The index of $F(x)$ is $\delta$ and one finds the prime polynomial decomposition

$$F(x) = x \prod_{\beta} (x^{\delta} - \beta),$$

where $\beta$ runs through all of the roots of

$$\beta^{(p^f - 1)/\delta} = a.$$
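A decomposition of this kind can be verified by direct multiplication in the special case $f = 1$, where $K_f$ is the prime field. The Python sketch below assumes the decomposition has the shape $x^{p} - ax = x \prod_{\beta} (x^{\delta} - \beta)$ with $\beta^{(p-1)/\delta} = a$ and $\delta$ the multiplicative order of $a$; the helper names are hypothetical and the check is an illustration only, not the author's procedure.

```python
def poly_mul(a, b, p):
    """Product of two coefficient lists (low degree first) over GF(p)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out

def mult_order(a, p):
    """Smallest delta >= 1 with a**delta == 1 (mod p); a must be nonzero mod p."""
    k, t = 1, a % p
    while t != 1:
        t = t * a % p
        k += 1
    return k

def decomposition_holds(p, a):
    """Check x^p - a x == x * prod_beta (x^delta - beta) over GF(p)."""
    delta = mult_order(a, p)
    m = (p - 1) // delta
    betas = [b for b in range(1, p) if pow(b, m, p) == a % p]
    product = [0, 1]                                   # the factor x
    for b in betas:
        factor = [(-b) % p] + [0] * (delta - 1) + [1]  # x^delta - beta
        product = poly_mul(product, factor, p)
    target = [0, (-a) % p] + [0] * (p - 2) + [1]       # x^p - a x
    return len(betas) == m and product == target

# Example: p = 7, a = 2 has order 3, and x^7 - 2x = x (x^3 - 3)(x^3 - 4).
print(decomposition_holds(7, 2))
```

The example in the last comment can be confirmed by hand: $(x^3 - 3)(x^3 - 4) = x^6 - 7x^3 + 12 \equiv x^6 - 2 \pmod 7$.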


2. When

$$F(x) = (x^{p^f} - x) \times (x^p - x)$$

the irreducible factors must have the degrees 1 and $p$, and since

$$F(x) = \prod_{a} (x^p - x - a),$$

where $a$ runs through all of the elements of $K_f$, it is sufficient to decompose the factors of this product. One finds


—x—a= [J (x? —ar'x— 8), a ¥0, 
8 


where fa-? runs through all solutions of 

One can also show that 

f(x) = x? — a? 1x — 8B 

is reducible if and only if Ba~ satisfies 

te 
3. In the case 
F(x) = (x? — ax) X — x), = 1, 


it follows from the general theory that the irreducible factors are of degree 
1 and 6. One obtains 


F(x) = Il (x — ax — B), 


and putting ¢=8/(1—a) one finds 
—ax —B = (x ((x — y), 


where y runs through«all solutions of 
18 = a. 


At this point it may be of interest to determine the decomposition of a poly- 


nomial 
f(x) = x”? —ax— 8B. 


This problem occurs in connection with the determination of prime ideal 
decompositions in relative Kummer fields. The number 
a = 


is rational and we can suppose a1 since this case has been treated under 2. 
One finds that f(x) has the root 


1 
> (p—1) 
— 
in K; and consequently 
x? —ax —B = (x [] —d), 


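6. Irreducible polynomials and linear substitutions. Now let

(3) $F(x) = x^{p^{2f}} + a x^{p^f} + \beta x$

be a $p^f$-polynomial whose corresponding polynomial

(4) $f(x) = x^2 + ax + \beta$

is irreducible in $K_f$. In order to study the prime polynomial decomposition of $F(x)$ we put $t = x^{p^f - 1}$ and obtain

(5) $F(x) = x\,\bigl(t^{p^f + 1} + a t + \beta\bigr).$

Any root $t$ of the second factor in (5) must satisfy the relation

$$t^{p^f} = -\frac{a t + \beta}{t},$$

and so we are naturally led to the study of irreducible polynomials whose roots are connected by linear substitutions

(6) $x^{p^f} = \dfrac{\alpha x + \beta}{\gamma x + \delta}.$

Such prime polynomials are obviously divisors of

(7) $\Psi(x) = \gamma x^{p^f + 1} + \delta x^{p^f} - \alpha x - \beta.$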


In (6) and (7) we can always assume $\gamma \neq 0$ since the polynomials
(8) w+ 


have been completely decomposed in Nos. 2 and 3 of the preceding section. We are also mainly interested in the case where the linear substitution (6) leaves no element of $K_f$ unchanged. We suppose then that the equation

$$x = \frac{\alpha x + \beta}{\gamma x + \delta}$$

has no solution in $K_f$ and this is equivalent to the statement that the equation

(9) $\psi(x) = \gamma x^2 + (\delta - \alpha)x - \beta = 0$

is irreducible in $K_f$.

If namely $\psi(x)$ in (9) has the root $\rho$ in $K_f$, then $\Psi(x)$ in (7) also has the root $\rho$ and one finds after putting $y = x - \rho$

$$\Psi(x) = y\,\bigl(\gamma y^{p^f} + (\gamma\rho + \delta)\,y^{p^f - 1} + \gamma\rho - \alpha\bigr).$$

The second factor in this product is again of the type (8) when $z$ is substituted for $1/y$.

We suppose, then, that (9) is irreducible and has the two roots $\gamma_1$ and $\gamma_2$. This corresponds in our first special case to the assumption that the polynomial (4) is irreducible. In this case $\Psi(x)$ has no linear factors.

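From (6) one obtains through iteration

(10) $x^{p^{nf}} = \dfrac{\alpha_n x + \beta_n}{\gamma_n x + \delta_n},$

and one verifies that the coefficients of this substitution are given by*

(11)
$$\begin{aligned}
(\omega_1 - \omega_2)\,\alpha_n &= (\alpha - \omega_2)\,\omega_1^n - (\alpha - \omega_1)\,\omega_2^n,\\
(\omega_1 - \omega_2)\,\beta_n &= \beta\,(\omega_1^n - \omega_2^n),\\
(\omega_1 - \omega_2)\,\gamma_n &= \gamma\,(\omega_1^n - \omega_2^n),\\
(\omega_1 - \omega_2)\,\delta_n &= (\omega_1 - \alpha)\,\omega_1^n - (\omega_2 - \alpha)\,\omega_2^n,
\end{aligned}$$

where $\omega_1$ and $\omega_2$ are the roots of

(12) $\omega(x) = x^2 - (\alpha + \delta)x + \alpha\delta - \beta\gamma = 0$

and hence

$$\omega_1 = \gamma\gamma_1 + \delta, \qquad \omega_2 = \gamma\gamma_2 + \delta.$$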


* These expressions were given by Serret, Sur les fonctions rationnelles linéaires prises suivant un module premier etc., Comptes Rendus, Paris, vol. 48 (1859), pp. 112-117.

Now let $n$ be the degree of an irreducible factor of $\Psi(x)$. Then $n$ is the smallest number such that the roots of the factor satisfy the equation

(13) $x = \dfrac{\alpha_n x + \beta_n}{\gamma_n x + \delta_n}.$

If $\gamma_n \neq 0$ one finds that a solution $\rho$ of (13) is also a solution of (9), and since such a root cannot be a root of $\Psi(x)$, we shall have to suppose $\gamma_n = 0$ and hence according to (11)

(14) $\omega_1^n = \omega_2^n.$

In this case one obtains from (11) that the right-hand side of (10) reduces to $x$ and it follows that the degree of any factor of $\Psi(x)$ is the smallest exponent $n$ such that (14) is satisfied. Since the number $\omega_1/\omega_2 = \varepsilon_1$ is a root of the congruence

(15) $\varphi(x) = x^2 + \Bigl(2 - \dfrac{(\alpha + \delta)^2}{\alpha\delta - \beta\gamma}\Bigr)x + 1 = 0$

and since the irreducibility of (9) follows from the irreducibility of (15), we conclude:

THEOREM 6. Let

$$\Psi(x) = \gamma x^{p^f+1} + \delta x^{p^f} - \alpha x - \beta, \qquad \gamma \neq 0,$$

be a polynomial with coefficients in $K_f$ chosen such that the polynomial

$$\varphi(x) = x^2 + \Bigl(2 - \dfrac{(\alpha + \delta)^2}{\alpha\delta - \beta\gamma}\Bigr)x + 1$$

is irreducible in $K_f$. When $\varphi(x)$ belongs to the exponent $n$, then $\Psi(x)$ is the product of $(p^f + 1)/n$ irreducible factors of degree $n$, and when $\varphi(x)$ belongs to the maximal exponent $p^f + 1$, $\Psi(x)$ is irreducible.
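The degree $n$ in Theorem 6 is the order of the linear substitution, that is, the smallest $n$ for which the $n$-th power of the coefficient matrix is a scalar matrix. The following Python sketch computes this order in the special case $f = 1$ (coefficients in the prime field); it is an illustration under that assumption, and the names `substitution_order` and `has_fixed_point` are hypothetical.

```python
def mat_mul(A, B, p):
    """Product of two 2x2 matrices with entries reduced modulo p."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) % p for j in range(2))
        for i in range(2)
    )

def is_scalar(M):
    """True when M (entries already reduced) is a multiple of the identity."""
    return M[0][1] == 0 and M[1][0] == 0 and M[0][0] == M[1][1]

def substitution_order(alpha, beta, gamma, delta, p):
    """Order of x -> (alpha x + beta)/(gamma x + delta) over GF(p)."""
    if (alpha * delta - beta * gamma) % p == 0:
        raise ValueError("the substitution must have a nonzero determinant")
    M = ((alpha % p, beta % p), (gamma % p, delta % p))
    power, n = M, 1
    while not is_scalar(power):
        power = mat_mul(power, M, p)
        n += 1
    return n

def has_fixed_point(alpha, beta, gamma, delta, p):
    """True when gamma x^2 + (delta - alpha) x - beta has a root in GF(p)."""
    return any((gamma * x * x + (delta - alpha) * x - beta) % p == 0
               for x in range(p))

# Sample data: x -> 1/(x + 1) over GF(3) and x -> 2/x over GF(5).
for (a, b, c, d, p) in [(0, 1, 1, 1, 3), (0, 2, 1, 0, 5)]:
    if not has_fixed_point(a, b, c, d, p):
        n = substitution_order(a, b, c, d, p)
        print(p, n, (p + 1) // n)   # predicted: (p + 1)/n factors of degree n
```

For the first sample the order is $p + 1 = 4$, so $\Psi(x) = x^4 + x^3 - 1$ over $GF(3)$ is predicted to be irreducible; for the second the order is 2, so $x^6 - 2$ over $GF(5)$ is predicted to split into three quadratic factors.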


It is also possible to give the complete prime polynomial decomposition of $\Psi(x)$ and hence to exhibit explicitly irreducible polynomials having a degree equal to an arbitrary divisor of $p^f + 1$. One finds, namely, that $\Psi(x)$ may be brought into the form

(16) $\Psi(x) = a_1 (x - \gamma_1)^{p^f+1} + a_2 (x - \gamma_2)^{p^f+1},$

where $\gamma_1$ and $\gamma_2$ as formerly denote the roots of (9), while $-a_1/a_2$ is a root $\varepsilon_1$ of (15). From (16) we obtain the decomposition


(17) W(x) = TI] (or (x — va)” + 


where p; and p.‘ are two conjugate elements in the field Kz, such that’ the 
quotient —ps /p; runs through all m =(p/+1)/n solutions of the equation 


(18) 


| 

| x™ = 




The actual determination of the roots of (18) may be done in the following 
way. Any solution can be represented in the form 


= (7 + w2)/(r + 1) 
and the equation (18) takes the symmetric form 


(19) 


If we suppose p¥2 we can write 


+ w1)™ = we(r + we)”. 


= A + we = A — 


+ By, 


= 


and to satisfy (19) we must have 
m 
2 


(20) 


+A +4 + + ) =0. 


This congruence must have m different solutions and each solution deter- 
mines a factor of Y(x) in (16). 

One can, however, derive these irreducible factors in rational form in a 
different way, which more clearly shows their relation to the linear substitu- 
tions. Let the numbers a, 8, +, 5 satisfy the conditions of Theorem 6 and let 
us construct the expression 


x 
R(x) = 
yx+ Yn—1% + 


For a root A of W(x) we have 
RA) =A MW = 


where a; is the coefficient of x*-' in the corresponding irreducible factor in 
(16), hence 


a= 
pi + 


Since the equation of mth degree 
R(x) 




is satisfied by all roots of the irreducible factor of $\Psi(x)$ having the coefficient $a_1$, we have

THEOREM 7. Let $p \neq 2$, and let $\alpha, \beta, \gamma, \delta$ be chosen such that the polynomial (15) is irreducible in $K_f$, while the order of the linear substitution

$$\frac{\alpha x + \beta}{\gamma x + \delta}$$

is $n$, where $n \cdot m = p^f + 1$. The equation of $n$th degree


ax + B An—1% + Bn—1 n 


Q(x) = 


is then irreducible for all r‘*) satisfying (20). 


One sees from the proof of this theorem that ¥(x) may be represented as 
the product of factors P;(x) where 


P(x) = (yx + 8) + 
It should also be observed that one can obtain similar results through a con- 
sideration of the product of the linear transformations. 
CHAPTER IV. MISCELLANEOUS THEOREMS ON HIGHER CONGRUENCES 


1. Elements with unit norm. We shall now deduce a few results which 
may be considered as the rudiments of the class field theory in finite fields. 
We show first 


THEOREM 1. The necessary and sufficient condition that a number $a$ in the field $K_{nf}$ satisfy the relation

(1) $a^{(p^{nf} - 1)/(p^f - 1)} = 1$

is that $a$ be representable in the form

(2) $a = \beta^{p^f - 1}.$

It is obvious that every element of the form (2) satisfies (1). On the other hand, one finds that (1) represents the necessary and sufficient condition that $x^{p^{nf}} - x$ be symbolically right-hand divisible by $x^{p^f} - ax$, and hence when (1) is satisfied the equation

(3) $x^{p^f} - ax = 0$

has a solution $\beta \neq 0$ in $K_{nf}$.
If one wishes to determine the form of the number $\beta$ in the representation (2), we divide $x^{p^{nf}} - x$ left-hand by $x^{p^f} - ax$ and find

(4) $x^{p^{nf}} - x = (x^{p^f} - ax) \times Q(x),$

where $Q(x)$ denotes the resulting symbolic quotient. The relation (4) shows that

$$\beta = Q(\omega),$$

where $\omega$ is an arbitrary element in $K_{nf}$ such that $Q(\omega) \neq 0$.

The condition (1) may also be written

$$N_f(a) = a \cdot a^{p^f} \cdots a^{p^{(n-1)f}} = 1$$

and Theorem 1 is seen to be the analogue of the well known theorem on cyclic fields, that every element whose norm is unity may be represented as the quotient of two conjugate elements. The ordinary proof for this theorem could not be applied in our case, since it requires that the field contain an infinite number of elements.
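The statement that the elements of unit norm are exactly the quotients $\beta^{p^f}/\beta$ can be checked exhaustively in a small field. The Python sketch below takes $f = 1$ and $n = 2$ and models $K_{2}$ as $GF(p)[i]$ with $i^2 = -1$, which is a field when $p \equiv 3 \pmod 4$; this model and the helper names are assumptions made for the illustration.

```python
def gf_mul(u, v, p):
    """Product in GF(p)[i], i^2 = -1; elements are pairs (a, b) = a + b i."""
    (a, b), (c, d) = u, v
    return ((a * c - b * d) % p, (a * d + b * c) % p)

def gf_pow(u, e, p):
    """Square-and-multiply power of u."""
    result = (1, 0)
    while e:
        if e & 1:
            result = gf_mul(result, u, p)
        u = gf_mul(u, u, p)
        e >>= 1
    return result

def norm_one_equals_quotients(p):
    """Compare {a : a^(p+1) = 1} with {beta^(p-1) : beta != 0} in GF(p^2)."""
    nonzero = [(a, b) for a in range(p) for b in range(p) if (a, b) != (0, 0)]
    norm_one = {u for u in nonzero if gf_pow(u, p + 1, p) == (1, 0)}
    quotients = {gf_pow(beta, p - 1, p) for beta in nonzero}
    return norm_one == quotients and len(norm_one) == p + 1

# Here (p^(nf) - 1)/(p^f - 1) = p + 1, so the norm of a is a^(p + 1).
for p in (3, 7, 11):
    print(p, norm_one_equals_quotients(p))
```

In this model the norm of $a$ is $a \cdot a^{p} = a^{p+1}$, and the two sets agree because the multiplicative group of the field is cyclic of order $p^2 - 1$.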

One may also state Theorem 1 in the following equivalent form: 


THEOREM 2. The necessary and sufficient condition that an irreducible polynomial $f(x)$ of degree $n$ with coefficients in $K_f$ belong to an exponent $N$ dividing $(p^{nf} - 1)/(p^f - 1)$ is that the last coefficient $a_n$ in $f(x)$ be unity, and in this case every root $\rho$ of $f(x)$ may be represented in the form

$$\rho = \sigma^{p^f}/\sigma,$$

where $\sigma$ is an element of $K_{nf}$.


One may also express Theorem 1 in a somewhat more general form. Let 
namely 


be a $p^f$-polynomial dividing $x^{p^{nf}} - x$, and let

(6) $x^{p^{nf}} - x = F(x) \times G(x).$

Expressing the condition that $x^{p^f} - ax$ be a right-hand symbolic divisor of $F(x)$, we find

THEOREM 3. Let $F(x)$ be a $p^f$-polynomial given by (5) and let $G(x)$ be its complementary polynomial such that (6) is satisfied. An element $a$ in $K_{nf}$ satisfying the condition

is then representable in the form

$$a = \beta^{p^f - 1},$$

where $\beta$ is a root of $F(x) = 0$, hence $\beta = G(\omega)$ for a primitive element $\omega$ of $K_{nf}$.




2. The law of reciprocity. There exists for higher congruences a very simple and general law of reciprocity. This was first pointed out by F. K. Schmidt*, although special instances of it were already known to Dedekind.† Recently the theorem has been rediscovered by Carlitz‡, who seems to have overlooked the paper of Schmidt. Carlitz gives two different proofs mapped on the proofs for the quadratic law of reciprocity. In the following I give a new and very simple proof for the law of reciprocity in its most general form.

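Let $d$ be a divisor of $p^f - 1$ and let

$$d \cdot \delta = p^f - 1.$$

The equation

(7) $x^d = 1$

is then solvable and has the $d$ roots

(8) $\varepsilon_1, \varepsilon_2, \dots, \varepsilon_d$

in $K_f$. We define the field $K_{nf}$ over $K_f$ through a root $\omega$ of the irreducible equation

$$f(x) = a_0 x^n + \cdots + a_{n-1}x + a_n,$$

where we do not, as usual, suppose that $a_0 = 1$. Let then

$$g(\omega) = \beta_0\omega^m + \cdots + \beta_{m-1}\omega + \beta_m$$

be an arbitrary element in $K_{nf}$, and hence

(9) $g(\omega)^{(p^{nf} - 1)/d} = \varepsilon,$

where $\varepsilon$ is one of the roots (8). One may obviously write (9) in the form of a congruence

$$g(x)^{(p^{nf} - 1)/d} \equiv \varepsilon \pmod{f(x)},$$

and when we introduce the $d$th power residue symbol

(10) $\left(\dfrac{g(x)}{f(x)}\right)_d \equiv g(x)^{(p^{nf}-1)/d} \pmod{f(x)},$

we find that it has the property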


* F. K. Schmidt, Zur Zahlentheorie in Körpern von der Charakteristik p, Erlangen Sitzungsberichte, vols. 58-59 (1928), pp. 159-172.

† R. Dedekind, Abriss einer Theorie der höheren Kongruenzen in Bezug auf einen reellen Primzahlmodulus, Journal für Mathematik, vol. 54 (1857), pp. 1-26; Werke, vol. 1, pp. 40-67.

‡ L. Carlitz, The arithmetic of polynomials in a Galois field, American Journal of Mathematics, vol. 54 (1932), pp. 39-50. See also On a theorem of higher reciprocity, Bulletin of the American Mathematical Society, vol. 39 (1933), pp. 155-160.


that

$$\left(\frac{g(x)}{f(x)}\right)_d = 1$$

is the necessary and sufficient condition that $g(x)$ be a $d$th power residue (mod $f(x)$).
This definition (10) gives the dth power residue symbol only for prime 
f(x). In the general case, where f(x) has the prime factor decomposition 


f(x) = filx) + f(z), 


and 


we put 
(11) (<2) (#5) (22) 
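To prove the law of reciprocity, let us first consider the symbol for a prime $f(x)$. Then according to (10) we obtain

(12) $\left(\dfrac{g(x)}{f(x)}\right)_d = \bigl(g(\omega)\,g(\omega^{p^f}) \cdots g(\omega^{p^{(n-1)f}})\bigr)^{\delta} = a_0^{-m\delta}\,R(f(x), g(x))^{\delta},$

where $R(f, g)$ denotes the resultant of the two polynomials. The definition (11) then shows that the same formula (12) holds for an arbitrary $f(x)$. For the inverse symbol we obtain in the same way

$$\left(\frac{f(x)}{g(x)}\right)_d = \beta_0^{-n\delta}\,R(g(x), f(x))^{\delta} = (-1)^{mn\delta}\,\beta_0^{-n\delta}\,R(f(x), g(x))^{\delta},$$

and hence

THEOREM 4. For the $d$th power residue symbol one has the law of reciprocity

$$a_0^{\,m(p^f-1)/d}\left(\frac{g(x)}{f(x)}\right)_d = (-1)^{mn(p^f-1)/d}\,\beta_0^{\,n(p^f-1)/d}\left(\frac{f(x)}{g(x)}\right)_d,$$

where $n$ and $m$ are the degrees and $a_0$ and $\beta_0$ are the highest coefficients of the relatively prime polynomials $f(x)$ and $g(x)$.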
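For $d = 2$, $f = 1$ and monic prime polynomials the law reduces to the statement that the two quadratic symbols differ by the sign $(-1)^{mn(p-1)/2}$. The Python sketch below checks this special case over $GF(3)$; it computes the symbol as $g(x)^{(p^n - 1)/2}$ modulo $f(x)$. The helper names and the list of prime polynomials are supplied here for illustration and are not part of the paper.

```python
def trim(a):
    """Drop trailing zero coefficients (lists are low degree first)."""
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_mul(a, b, p):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return trim(out)

def poly_mod(a, m, p):
    """Remainder of a modulo m over GF(p); m must be nonzero and trimmed."""
    a = trim(a[:])
    inv = pow(m[-1], -1, p)
    while len(a) >= len(m):
        c = a[-1] * inv % p
        shift = len(a) - len(m)
        for i, mi in enumerate(m):
            a[shift + i] = (a[shift + i] - c * mi) % p
        trim(a)
    return a

def poly_pow_mod(base, e, m, p):
    result, base = [1], poly_mod(base, m, p)
    while e:
        if e & 1:
            result = poly_mod(poly_mul(result, base, p), m, p)
        base = poly_mod(poly_mul(base, base, p), m, p)
        e >>= 1
    return result

def residue_symbol(g, f, p):
    """Quadratic symbol (g/f) for prime f over GF(p), p odd: +1 or -1."""
    n = len(f) - 1
    r = poly_pow_mod(g, (p**n - 1) // 2, f, p)
    if r == [1]:
        return 1
    if r == [p - 1]:
        return -1
    raise ValueError("f must be prime and g relatively prime to f")

def reciprocity_holds(f, g, p):
    """Check (g/f) = (-1)^(m n (p-1)/2) (f/g) for monic prime f and g."""
    n, m = len(f) - 1, len(g) - 1
    sign = (-1) ** (m * n * (p - 1) // 2)
    return residue_symbol(g, f, p) == sign * residue_symbol(f, g, p)

# Monic prime polynomials of degree 1 and 2 over GF(3), low degree first:
# x, x+1, x+2, x^2+1, x^2+x+2, x^2+2x+2.
PRIMES_GF3 = [[0, 1], [1, 1], [2, 1], [1, 0, 1], [2, 1, 1], [2, 2, 1]]
print(all(reciprocity_holds(f, g, 3)
          for f in PRIMES_GF3 for g in PRIMES_GF3 if f != g))
```

As a hand check, for $f = x$ and $g = x + 1$ over $GF(3)$ one finds $(g/f) = 1$ and $(f/g) = -1$, in agreement with the sign $(-1)^{1 \cdot 1 \cdot 1} = -1$.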


This proof also suggests generalizations of the law of reciprocity using 
some other symmetric function than the resultant. Let 




denote a symmetric function in each of the sets u,; and »;, and let us suppose 
in addition that 
(13) Sn,m(U, 0) = Smin(v, 


Various symmetric functions having these properties may be constructed. 
Now let f(x) and g(x) be two polynomials with the roots x, ---, x, and 
V1, °° * Ym, and let us define 


= ee ee 


It is then obvious according to (13) that 


(iat 


YALE UNIVERSITY, 
New Haven, Conn, 




ERRATA IN MY PAPER “ON A SPECIAL CLASS OF 
POLYNOMIALS’* 


BY 
OYSTEIN ORE 


This paper contains a number of disturbing misprints: Equation (2) p. 
560 should read 


Line 17 p. 561 read A,(x) XB,(x) instead of A,(x)B,(x). 

The term perfect (volkommen) in Theorem 1 is used in the sense of Stein- 
itz, Algebraische Theorie der Kérper, edited by Hasse and Baer, pp. 50-51. 

Line 21 p. 562 should read 


F(x) = Qp(x) X (x? — ax) + Ax. 
Equation (9) p. 562 should read 


In the expression line 9 p. 563 the last term should be A“ x. 
Equation (17) p. 564 should read 


F,,(x) = F,-1(x)? 
Line 8 from below p. 574 should read 


Bo (2) X B,(x) = (mod A,(x)). 


In line 2 from below p. 575 the last term is 


(1) qa)? q@) 
Ay B,(x)Ap XA, (x). 


Line 18 p. 576 read x? —w?-'x. 
Line 12 p. 580 read F,(x) =x” XG,(x). 


* These Transactions, vol. 35 (1933), pp. 559-584. 


YALE UNIVERSITY, 
NEw Haven, Conn. 




ALMOST PERIODIC TRANSFORMATIONS* 


BY 
R. H. CAMERON 


1. INTRODUCTION 


When one studies periodic transformations such as, for example, rota- 
tions, he often encounters transformations which are not periodic but which 
are, in a very real and non-technical sense, almost periodic. For instance, re- 
peated rotation through an angle which is an irrational part of a revolution 
will never bring a point set back point-for-point into itself; yet this object 
may be as nearly attained as we please by repeating the process an appropri- 
ate number of times. Moreover, such “appropriate” numbers are relatively 
dense† among the integers. This example suggests the definition of an a.p.
(almost periodic) transformation; it being only necessary to make precise the 
meaning of “as nearly as possible” when applied to bringing the points of a 
set back into themselves. 

Consider, for example, a set $\mathfrak{T}$ of uniformly continuous transformations which take each point of a complete metric space $\mathfrak{E}$ into a point of $\mathfrak{E}$. Let $\mathfrak{T}$ contain the identity and the product of any two of its elements. Then if $\xi$ is a variable point of $\mathfrak{E}$ and $\Phi(\xi)$ and $\Psi(\xi)$ are any two elements of $\mathfrak{T}$, let the smaller of the two quantities, unity and the least upper bound for all $\xi$ in $\mathfrak{E}$ of the distance between the points $\Phi(\xi)$ and $\Psi(\xi)$, be called the distance between the transformations $\Phi(\xi)$ and $\Psi(\xi)$, and let it be indicated by $\|\Phi(\xi), \Psi(\xi)\|$. Then $\|\Phi(\xi), \xi\|$ represents one way of telling how nearly the points $\Phi(\xi)$ approximate the points $\xi$. Moreover a transformation $\Phi(\xi)$ of $\mathfrak{T}$ will be called a.p. if to each positive number $\varepsilon$ there corresponds a positive integer $L$ so great that among each $L$ successive positive integers there is an integer $N$ satisfying $\|\Phi^N(\xi), \xi\| < \varepsilon$. This is merely an example of a definition of an a.p. transformation. A more general definition will be given in the next section. Transformations will be thought of as points in a new space, and a.p. points will be defined. Moreover, for simplicity in wording and notation, most of the theorems on a.p. transformations will be stated in terms of a.p. points. However, the reader may readily re-phrase them in terms of the more natural and significant a.p. transformations.

* Presented to the Society, December 29, 1932, and April 14, 1933; received by the editors 
November 28, 1932, and in revised form, September 6, 1933. 


† A set of real numbers is called relatively dense if there exists a positive number $L$ so great that every interval of length $L$ contains at least one element of the set.




For the sake of generality the concepts of a.p. functions and sequences 
in a complete metric space have been introduced. They include ordinary 
a.p. functions and sequences as special cases; and may also be thought of as 
including a.p. transformations as a special case. However, this work is not 
merely a generalization of the standard theory of a.p. functions and se- 
quences, for my most significant theorem—the climax of the whole theory— 
applies to a.p. transformations (or points) alone, and does not seem to be 
susceptible of generalization to space functions or sequences. The theorem 
to which I refer is Theorem V, §8, which shows that every a.p. transformation 
can be expressed as an infinite product of simpler transformations. 


2. ALMOST PERIODIC POINTS, SPACE FUNCTIONS, AND SPACE SEQUENCES 


DEFINITION. A space $\mathfrak{T}$ will be called a G-space if it satisfies the postulates

a. $\mathfrak{T}$ is metric and complete. (Let $\|\varphi, \psi\|$ denote the distance from $\varphi$ to $\psi$.)

b. An operation called multiplication is defined so that to each ordered pair of points $\varphi$ and $\psi$ corresponds a unique point $\varphi\psi$. The operation is associative, and the space contains an identity point $I$.

c. $\|\varphi\theta, \psi\theta\| \leq \|\varphi, \psi\|$ for any three points $\varphi, \psi, \theta$.

d. The product $\theta\varphi$ is a uniformly continuous function of the variable point $\varphi$ for each point $\theta$.

THEOREM I. In any G-space, $\|\varphi\theta, \psi\theta\| = \|\varphi, \psi\|$ if $\theta^{-1}$ exists.*

THEOREM II. In any G-space the product $\theta\varphi$ is a continuous function of the points $\theta$ and $\varphi$.


DEFINITION. Let a positive number $\varepsilon$ and a point $\varphi$ of the G-space $\mathfrak{T}$ be given. Then an integer $N$ which satisfies $\|\varphi^N, I\| < \varepsilon$ is called an $\varepsilon$-iteration exponent of $\varphi$. Moreover a point $\varphi$ of $\mathfrak{T}$ is called a.p. if to each positive number $\varepsilon$ there corresponds a positive integer $L$ so great that among every $L$ successive positive integers there is an $\varepsilon$-iteration exponent of $\varphi$.

DEFINITION. Let each point of a G-space $\mathfrak{T}$ be a (1, 1) transformation which takes a set or space $\mathfrak{E}$ into a subset of itself. Then an a.p. point $\varphi$ of $\mathfrak{T}$ will be called an a.p. transformation.

It can readily be verified that this definition includes the special case of the definition given above. Moreover a.p. points are no more general than a.p. transformations, for if a G-space $\mathfrak{T}$ is given, a space $\mathfrak{T}'$ of transformations can always be set up isomorphic with $\mathfrak{T}$. Merely let $\xi' = \varphi\xi$ correspond to the point $\varphi$.

* The symbol $\theta^{-1}$ denotes a point which satisfies the equations $\theta\theta^{-1} = \theta^{-1}\theta = I$.




As an example of a G-space, consider the set of all complex numbers whose 
absolute value is unity or less. Take multiplication and distance in their 
ordinary sense. In this space, all the numbers whose absolute value is unity 
are a.p. points. 
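For a point of absolute value unity the iteration exponents can be listed directly. The sketch below is only an illustration under assumed sample values: it takes the point $z = e^{2\pi i\theta}$ with $\theta = \sqrt{2}$ and $\varepsilon = 0.1$, lists the positive integers $N \le 2000$ with $|z^N - 1| < \varepsilon$, and reports the largest gap between consecutive exponents. A bounded gap is what the definition of an a.p. point requires. The function names are hypothetical.

```python
import cmath
import math

def iteration_exponents(theta, eps, limit):
    """Positive integers N <= limit with |exp(2 pi i N theta) - 1| < eps."""
    return [N for N in range(1, limit + 1)
            if abs(cmath.exp(2j * math.pi * N * theta) - 1) < eps]

def max_gap(exponents):
    """Largest difference between consecutive entries of a sorted list."""
    return max(b - a for a, b in zip(exponents, exponents[1:]))

# Assumed sample values: rotation through sqrt(2) of a revolution, eps = 0.1.
exps = iteration_exponents(math.sqrt(2), 0.1, 2000)
print(exps[:5])
print(max_gap(exps))
```

Any integer $L$ exceeding the largest gap serves as the number $L$ of the definition for this $\varepsilon$, at least over the range examined.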

Another example of a G-space is the set of all complex numbers; the prod- 
uct of two points being the sum of the numbers. In this case the unit point 
(the number zero) is the only a.p. point. 

DEFINITION. Let $\mathfrak{S}$ be a complete metric space, and $\Phi(t)$ a function defined over a set of real numbers and having its set of values in $\mathfrak{S}$. Let $s$ be a real number such that $t+s$ is in the set of definition of $\Phi(t)$ for all values of $t$ in that set of definition. Then $s$ will be called an $\varepsilon$-translation number of $\Phi(t)$ if the distance between $\Phi(t)$ and $\Phi(t+s)$ is never greater than the positive number $\varepsilon$.

DEFINITION.* A continuous space function $\Phi(t)$ of the real variable $t$ which has its set of values in a complete metric space $\mathfrak{S}$ is called a.p. if its $\varepsilon$-translation numbers corresponding to each positive $\varepsilon$ are relatively dense. If each point of $\mathfrak{S}$ is a transformation of the points of a set $\mathfrak{E}$ into a subset of $\mathfrak{E}$, $\Phi(t)$ is called an a.p. family of transformations.

DEFINITION.* Let $\{\Gamma_n\}$ be a two-way sequence of points in a complete metric space $\mathfrak{S}$. Then if the $\varepsilon$-translation numbers of $\Gamma_n$ considered as a function of $n$ are relatively dense for each positive $\varepsilon$, $\Gamma_n$ is called an a.p. sequence. If each point of $\mathfrak{S}$ is a transformation, $\Gamma_n$ is called an a.p. sequence of transformations.


THEOREM III. An a.p. space function is uniformly continuous for all values 
of its argument. 


THEOREM IV. An a.p. space function or sequence is bounded. 


These two theorems can be proved in the same way as the special case of numerical a.p. functions or sequences.†

THEOREM V. In the complete metric space $\mathfrak{S}$ let $\{\Gamma_n\}$ be an a.p. sequence of points such that the distance between $\Gamma_m$ and $\Gamma_n$ equals the distance between $\Gamma_{m+1}$ and $\Gamma_{n+1}$ for every pair of integers $m$ and $n$, and let $\mathfrak{T}$ be the subspace consisting of the points $\Gamma_n$ and their limit points. Then‡


* The author is indebted to Dr. I. J. Schoenberg for the suggestion which led to this generaliza- 
tion of a.p. transformations. 

† H. Bohr, Zur Theorie der fastperiodischen Funktionen, Acta Mathematica, vol. 45, pp. 29-127, especially pp. 35 and 36.

t The notation lim 0(é)=8 

means that to each e>0 there Pen d>0 such that every point £ which satisfies || o(é), al| sd 
also satisfies <e. 




exists and is a point of $\mathfrak{T}$ whenever $\varphi$ and $\psi$ are both points of $\mathfrak{T}$. Moreover if we let (1) define multiplication, the space $\mathfrak{T}$ will be a G-space having $\Gamma_1$ as an a.p. point.


The existence of $\varphi\psi$ follows from the inequality


and the completeness of $\mathfrak{T}$. It is easy to verify the fact that $\mathfrak{T}$ is a G-space, and since $\Gamma_1^n = \Gamma_n$, the translation indices of the sequences are iteration exponents of $\Gamma_1$.

3. ALMOST PERIODIC CONTINUATION 


DEFINITION. Let $\mathfrak{J}$ be the set $a < t < +\infty$ or the set $-\infty < t < +\infty$. Let $\mathfrak{K}$ be an infinite set of real numbers such that the sum and difference of any two numbers in $\mathfrak{K}$ is in $\mathfrak{K}$; then a function $\Phi(t)$ defined on the intersection $\mathfrak{J}\mathfrak{K}$ and having its set of values in a complete metric space $\mathfrak{S}$ will be called asymptotically periodic if there exists a sequence of positive numbers $s_1, s_2, \dots$ in $\mathfrak{K}$ such that $s_n \to \infty$ as $n \to \infty$ and such that uniformly in $t$ on $\mathfrak{J}\mathfrak{K}$,

$$\lim_{n \to \infty} \Phi(t + s_n) = \Phi(t).$$


Lemma 1. If ®(t) is asymptotically periodic on SS, then there is one and 
only one asymptotically periodic function V(t) which equals ®(t) on JR but is 
defined on where is an interval containing 


For if ¢ is any number in &, for sufficiently great integers m and m the 
numbers ¢+Sm, t+5n, £+5m+5, are all in ¥R, and 


| + sm), + || OE sm + + 
+ + + sx), + 


It follows from the hypothesis concerning ® that the second member ap- 
proaches zero as m and m approach infinity, and hence that 


V(t) = lim + s,) 


no 


is defined for all on &; and we have V(#) = &(#) on ¥R. Now corresponding to 
an arbitrary e>0 there exists an integer N so great that for all ¢’ on $& and 
any 


+ sa); &(¢’)|| e. 




Thus for any ¢/ on & and any m>N, 
+ Sn), = lim || + Sa + Sp), Se, 
pro 


lim V(t + s,) = 


uniformly over the whole set &. Finally suppose @(#) is equal to S(#) on JK 
and satisfies uniformly on its set of definition )R the equation 


lim + = Q(2), 


where s,->©. Then on 9)&, for sufficiently great m and n, 


¥(2), < VE + + + sn), + + 
i+ + sa + sn), VE + + + 52), 


and since the second member approaches zero as m approaches infinity, 
Q(t) = V(t) on YR, and the lemma is proved. 

THEOREM I. An a.p. space function or space sequence is completely determined by its values on any half infinite interval.

THEOREM II. An a.p. point or transformation $\varphi$ possesses an inverse $\varphi^{-1}$ (in the case of the transformation, a single-valued inverse over its whole set of definition), and its integer powers form an a.p. sequence or sequence of transformations.

For if $N_n$ is a $(1/n)$-iteration exponent of $\varphi$ and its value is greater than $n$ for each positive $n$, then uniformly for non-negative $k$,

$$\lim_{n \to \infty} \varphi^{N_n + k} = \varphi^k$$

and $\varphi^n$ is an asymptotically periodic function $\varphi_n$ of the non-negative integer $n$. But by the lemma, $\varphi_n$ is defined for all integers $n$, and hence

$$\varphi\varphi_{-1} = \lim_{n \to \infty} \varphi\,\varphi^{N_n - 1} = I = \lim_{n \to \infty} \varphi^{N_n - 1}\varphi = \varphi_{-1}\varphi;$$

so that $\varphi_n = \varphi^n$ for all integers $n$. Moreover negative translation numbers for $\varphi^n$ exist, since $\|\varphi^{k-n}, \varphi^k\| = \|\varphi^k, \varphi^{k+n}\|$.


4. FOURIER SEQUENCES 


In the future it will often be convenient to state two or more theorems or 
definitions at once; and this will be done by the use of brackets. Where alter- 
native words or sets of words are to be used, both alternatives will be inserted 
in the brackets and separated by a semi-colon. If no words are needed for 




one of the alternatives, that will be indicated by a dash. In reading the the- 
orem, read one set of words taken in the same relative position from each 
pair of brackets. Parenthetical expressions are indicated in the ordinary way 
and have nothing to do with the brackets. 

DEFINITION. A finite or infinite sequence $f_1, f_2, \dots$ of real numbers will be called an upper Fourier sequence of [a continuous space function; a two-way space sequence; a point in a G-space] $\Theta$ if to each $\varepsilon > 0$ correspond a positive integer $N$ and a positive number $d$ such that any [number; integer; integer] $t$ whose multiples $tf_1, tf_2, \dots, tf_N$ all differ from integers by less than $d$ is an [$\varepsilon$-translation number; $\varepsilon$-translation index; $\varepsilon$-iteration exponent] of $\Theta$.

DEFINITION. A sequence will be called a lower Fourier sequence of $\Theta$ if to each positive number $d$ and positive integer $N$ (not greater than the number of elements $f_i$) corresponds a positive number $\varepsilon$ such that all the multiples $tf_1, tf_2, \dots, tf_N$ of any [$\varepsilon$-translation number; $\varepsilon$-translation index; $\varepsilon$-iteration exponent] $t$ of $\Theta$ differ from integers by less than $d$.

The relationship between upper and lower Fourier sequences will be given in Theorem V.

DEFINITION. A sequence which is both an upper and a lower Fourier sequence is called a Fourier sequence.
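The two definitions can be illustrated on the simplest a.p. point, a complex number $z = e^{2\pi i\theta}$ of absolute value unity, for which the one-term sequence $f_1 = \theta$ is a natural candidate. The Python sketch below tests both properties numerically on positive integers $t$ only, with $\theta = \sqrt{2}$; the choices $d = \varepsilon/(2\pi)$ for the upper property and $\varepsilon = 4d$ for the lower property rest on the elementary bounds $4x \le 2\sin \pi x \le 2\pi x$ for $0 \le x \le 1/2$ and are assumptions of this illustration, as are the function names.

```python
import cmath
import math

def dist_to_integer(x):
    """Distance from the real number x to the nearest integer."""
    return abs(x - round(x))

def dist_to_identity(theta, t):
    """|z^t - 1| for z = exp(2 pi i theta)."""
    return abs(cmath.exp(2j * math.pi * t * theta) - 1)

def upper_property(theta, eps, limit):
    """Every t <= limit with t*theta within d of an integer is an
    eps-iteration exponent, where d = eps / (2 pi)."""
    d = eps / (2 * math.pi)
    return all(dist_to_identity(theta, t) < eps
               for t in range(1, limit + 1)
               if dist_to_integer(t * theta) < d)

def lower_property(theta, d, limit):
    """Every eps-iteration exponent t <= limit has t*theta within d of an
    integer, where eps = 4 d."""
    eps = 4 * d
    return all(dist_to_integer(t * theta) < d
               for t in range(1, limit + 1)
               if dist_to_identity(theta, t) < eps)

theta = math.sqrt(2)
print(upper_property(theta, eps=0.1, limit=5000))
print(lower_property(theta, d=0.01, limit=5000))
```

A finite search of this kind cannot prove the properties, but it shows how the numbers $N$, $d$ and $\varepsilon$ of the definitions interact in a concrete case.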

THEOREM I. Every a.p. [function; sequence] in [Bohr’s; Walther’s] sense has at least one Fourier sequence.


In the case of the function, two of Bohr’s* theorems show that a Fourier sequence can be obtained by dividing each Fourier exponent by $2\pi$ and arranging them in countable order. Moreover, Walther† has shown how to construct corresponding to a given a.p. sequence an a.p. function whose set of integer $\varepsilon$-translation numbers corresponding to each given $\varepsilon > 0$ will be identical with the set of $\varepsilon$-translation indices of the sequence. Thus a Fourier sequence of the function will be a Fourier sequence of the sequence.

DEFINITION. If $s$ and $t$ are variable real [numbers; integers] and $\Theta(t)$ is an a.p. space [function; sequence], the real [function; sequence] $f(t) = \sup_s \|\Theta(s+t), \Theta(s)\|$ is called the Bochner translation [function; sequence] of $\Theta(t)$.

THEOREM II. The Bochner translation [function; sequence] of a given a.p. space [function; sequence] is a.p., and its set of $\varepsilon$-translation [numbers; indices] for each given $\varepsilon > 0$ is identical with the set of $\varepsilon$-translation [numbers; indices] of the given space [function; sequence].

* Zur Theorie der fastperiodischen Funktionen, Acta Mathematica, vol. 46 (1925), pp. 101-214, especially pp. 105 and 110.

† Fastperiodische Folgen und ihre Fouriersche Analyse, Atti del Congresso Internazionale dei Matematici, 1928 (VII), vol. 2, pp. 289-298, especially p. 290.




From Theorems I and II of this section and Theorem II, §3, we have im- 
mediately the following theorem which is fundamental in this work: 


THEOREM III. Every a.p. [space function; space sequence; point] has at least one Fourier sequence.


THEOREM IV. The necessary and sufficient condition that a [continuous space function; space sequence; point in a G-space] be a.p. is that it have an upper Fourier sequence.


Here sufficiency follows from Wennberg’s* theorem on Diophantine ap- 
proximation. 


THEOREM V. Each element $f_p$ of a lower Fourier sequence of an a.p. [space
function; space sequence; point] $\Theta$ is linearly dependent with integer coefficients
on a finite number (dependent on p) of the elements of any upper Fourier
sequence $f_1', f_2', \ldots$ of $\Theta$ [- -; and unity; and unity].

For let M be a positive integer greater than unity. Let $e_M$ be a positive
number such that $t f_p$ differs from an integer by less than 1/(2M) whenever
t is an $e_M$-[translation number; translation index; iteration exponent] of $\Theta$.
Let $d_M$ be a positive number less than 1/(2M) and $N_M$ a positive integer such
that t is an $e_M$-[translation number; translation index; iteration exponent]
of $\Theta$ whenever the numbers $t f_1', t f_2', \ldots, t f_{N_M}'$ all differ from integers by less
than $d_M$. Then there exists no number t at all which will make $t f_p + 1/M$,
$t f_1', t f_2', \ldots, t f_{N_M}'$ all differ from integers by less than $d_M$. Now according
to a theorem of [Bohr†; Giraud‡; Giraud‡], a necessary and sufficient condition
that there exist values of the variable [number; integer; integer] t which
bring a given set of linear functions $a_i t + b_i$ $(i = 0, 1, \ldots, Q)$ arbitrarily close
to integers is that every set of integer multipliers $g_0, g_1, \ldots, g_Q$ which make
the quantity $\sum_{i=0}^{Q} g_i a_i$ become [zero; an integer; an integer] should make the
quantity $\sum_{i=0}^{Q} g_i b_i$ an integer. In the present case arbitrarily good approximating
values of t do not exist if we put $a_0 = f_p$, $a_1 = f_1'$, $\ldots$, $a_{N_M} = f_{N_M}'$ and
$b_0 = 1/M$, $b_1 = 0$, $\ldots$, $b_{N_M} = 0$; hence the condition is not satisfied, and there
exist integers $g_M, g_{M,1}, \ldots, g_{M,N_M}$ such that

$$g_M f_p + \sum_{i=1}^{N_M} g_{M,i} f_i'$$

* Zur Theorie der Dirichlet'schen Reihen, Dissertation, Upsala, 1920, p. 19.

† Neuerer Beweis eines allgemeinen Kronecker'schen Approximationssatzes, Det Kgl. Danske
Videnskabernes Selskab, Mathematisk-Fysiske Meddelelser, vol. 6 (1924-25), Article 8.

‡ Sur la résolution approchée en nombres entiers d'un système d'équations linéaires non homogènes,
Société Mathématique de France, Comptes Rendus des Séances, 1914, pp. 29-32.


1934] ALMOST PERIODIC TRANSFORMATIONS 283 


is [zero; an integer; an integer] and such that $g_M/M$ is not an integer. Thus
each of the quantities $g_2 f_p, g_3 f_p, \ldots$ can be expressed as a finite linear
combination of [- -; unity and; unity and] the quantities $f_1', f_2', \ldots$ with
integer coefficients. Now if $k_1, k_2, \ldots, k_r$ are the prime factors of $g_2$, unity can
be expressed as a finite linear combination of $g_2, g_{k_1}, \ldots, g_{k_r}$ with integer
coefficients, since $g_M/M$ is not an integer; and hence $f_p$ can be so expressed
in terms of $g_2 f_p, g_{k_1} f_p, \ldots, g_{k_r} f_p$.
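
The closing step of this proof rests on the fact that integers having no common prime factor admit an integer combination equal to unity. The following is a minimal sketch of that step only, under stated assumptions: the numerical values of the multipliers and the names `extended_gcd`, `bezout_coefficients`, `g` and `f_p` are hypothetical illustrations and do not come from the paper.

```python
from fractions import Fraction

def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g, where g is the gcd of a and b."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

def bezout_coefficients(values):
    """Return (g, coeffs) with sum(c * v) == g == gcd of all the values."""
    g = values[0]
    coeffs = [1]
    for v in values[1:]:
        g, x, y = extended_gcd(g, v)
        coeffs = [c * x for c in coeffs] + [y]
    return g, coeffs

# Hypothetical multipliers: g_2 = 15 has prime factors 3 and 5,
# while g_3 = 4 is not divisible by 3 and g_5 = 6 is not divisible by 5.
g = [15, 4, 6]
common, coeffs = bezout_coefficients(g)

# If each g_i * f_p is an integer combination of the upper sequence,
# then so is f_p = sum(coeffs[i] * (g[i] * f_p)).
f_p = Fraction(7, 11)  # stand-in value, for the arithmetic only
recovered = sum(c * (gi * f_p) for c, gi in zip(coeffs, g))
```

Here `common` is the greatest common divisor of the multipliers; when it equals 1, `recovered` reproduces `f_p` exactly.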


THEOREM VI. A sequence whose elements are linearly dependent with integer
coefficients on [- -; unity and; unity and] a finite number of the elements of
a lower Fourier sequence of an a.p. [space function; space sequence; point]
$\Theta$ is itself a lower Fourier sequence of $\Theta$. A sequence on a finite number of whose
elements [- -; and unity; and unity] each element of an upper Fourier
sequence of $\Theta$ is linearly dependent with integer coefficients is itself an upper
Fourier sequence of $\Theta$.


For a linear combination of numbers with integer coefficients can be 
brought as close to an integer as we please by bringing the numbers suffi- 
ciently close to integers. The last two theorems lead immediately to the 


THEOREM VII. Let $f_1, f_2, \ldots$ be a Fourier sequence of an a.p. [space
function; space sequence; point] $\Theta$. Then a necessary and sufficient condition that a
sequence $f_1', f_2', \ldots$ should also be a Fourier sequence of $\Theta$ is that each $f_p'$ be
linearly dependent with integer coefficients on a finite number of $f_1, f_2, \ldots$
[- -; and unity; and unity], and vice versa.
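
The remark used to prove Theorem VI, that an integer combination of numbers is close to an integer when the numbers themselves are, can be made quantitative. The sketch below illustrates the elementary bound dist(Σ g_i x_i, Z) ≤ Σ |g_i| dist(x_i, Z); the function names and the sample values are assumptions chosen for illustration, not taken from the paper.

```python
def dist_to_int(x):
    """Distance from the real number x to the nearest integer."""
    return abs(x - round(x))

def combination(coeffs, xs):
    """Integer linear combination sum(g_i * x_i)."""
    return sum(g * x for g, x in zip(coeffs, xs))

def combination_bound(coeffs, xs):
    """Upper bound sum(|g_i| * dist(x_i, Z)) for dist(sum(g_i * x_i), Z)."""
    return sum(abs(g) * dist_to_int(x) for g, x in zip(coeffs, xs))

# Hypothetical data: three numbers near integers and integer coefficients.
coeffs = [3, -2, 5]
xs = [7.001, -2.9995, 0.0002]

actual = dist_to_int(combination(coeffs, xs))
bound = combination_bound(coeffs, xs)
```

Shrinking the distances of the `xs` from integers shrinks `bound`, and with it `actual`, which is all the proof requires.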


DEFINITION. A number module is a set of real numbers which forms a 
group under the operation of addition. It is called complete if it contains the 
number unity; otherwise incomplete. A denumerable number module which 
when arranged as a sequence constitutes a Fourier sequence is called a 
Fourier module. 

Obviously any countable arrangement of a Fourier module is a Fourier 
sequence. 


THEOREM VIII. Each a.p. [space function; space sequence; point] has one
and only one [- -; complete; complete] Fourier module.


For the [function; sequence; point] has a Fourier sequence $f_1, f_2, \ldots$.
Let $\phi$ be the set of all numbers which are linearly dependent with integer
coefficients on a finite number of the $f_i$ [- -; and unity; and unity]. By
Theorem VII, a sequence obtained by ordering $\phi$ is a Fourier sequence;
hence $\phi$ is a Fourier module. Now if $\phi'$ is any [- -; complete; complete]




Fourier module, it is linearly dependent on $\phi$ [- -; and unity; and unity]
and vice versa, and must therefore be $\phi$.


5. SCALARS 


DEFINITION. A finite or infinite sequence of real numbers will be called a
scalar, and the number of elements in it (which may and usually will be the
symbol $\infty$) will be called its length. The [sum; product] of two scalars of the
same length or product of one scalar by a number is the scalar obtained by
[adding; multiplying] corresponding elements or multiplying each element
by the number. The scalar identity $\iota$ is the sequence 1, 1, 1, $\ldots$; and the
scalar zero (indicated by an ordinary zero) is the sequence 0, 0, 0, $\ldots$; for
both $\iota$ and 0, the length of the scalar will be indicated by the context. Scalars
will be indicated by small Greek letters and their elements by corresponding
italics, thus: $\alpha$: $a_1, a_2, \ldots$.

DEFINITION. The absolute value $|\alpha|$ of an infinite scalar $\alpha$: $a_1, a_2, \ldots$ will
be the greatest lower bound for all positive integers n and k of

$$\frac{1}{n} + \frac{1}{k} + \max_{0<j\le n}\ \min_{-\infty<i<+\infty} |a_j + k!\,i|.$$

The absolute value $|\alpha|$ of a finite scalar $\alpha$: $a_1, \ldots, a_p$ will be the greatest
lower bound for all positive integers k of

$$\frac{1}{k} + \max_{0<j\le p}\ \min_{-\infty<i<+\infty} |a_j + k!\,i|.$$

Using $|\alpha - \beta|$ as the distance between $\alpha$ and $\beta$, one can verify that the
set of all scalars of the same given length is a metric space.
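
As an illustration of the scalar operations, here is a minimal sketch for finite scalars held as Python lists. It assumes that the absolute value of a finite scalar reads as the greatest lower bound over k of 1/k + max_j min_i |a_j + k! i|, and it truncates the range of k at a hypothetical cutoff `k_max`, so the value returned is only an upper estimate of the true greatest lower bound. All function names are illustrative.

```python
from math import factorial

def scalar_sum(a, b):
    """Sum of two scalars of the same length: add corresponding elements."""
    return [x + y for x, y in zip(a, b)]

def scalar_product(a, b):
    """Product of two scalars of the same length: multiply corresponding elements."""
    return [x * y for x, y in zip(a, b)]

def scale(c, a):
    """Product of the scalar a by the number c."""
    return [c * x for x in a]

def dist_to_multiple(x, m):
    """min over integers i of |x + m*i|: distance from x to the nearest multiple of m."""
    return abs(x - m * round(x / m))

def abs_finite(a, k_max=6):
    """Upper estimate of |a| for a non-empty finite scalar, using k = 1..k_max only."""
    return min(
        1.0 / k + max(dist_to_multiple(x, factorial(k)) for x in a)
        for k in range(1, k_max + 1)
    )

def distance(a, b, k_max=6):
    """Truncated version of the distance |a - b| between two finite scalars."""
    return abs_finite(scalar_sum(a, scale(-1, b)), k_max)
```

For instance, the scalar with the single element 6 lies on a multiple of 3! = 6, so with `k_max=5` the estimate is 1/3, reached at k = 3.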

DEFINITION. A reduced upper Fourier sequence of a [space function; space 
sequence; point in a G-space] is a sequence on a finite number of whose 
elements each element of an upper Fourier sequence is rationally linearly 
dependent. A base is a reduced upper Fourier sequence every finite subset of 
which is rationally linearly independent. A base is called minimal if each of 
its elements is rationally linearly dependent on a finite number of the ele- 
ments of an incomplete Fourier module, or in case none exists, the complete 
Fourier module. A base for a space sequence or a point in a G-space is called 
proper either if it contains a rational element or if unity is not rationally 
linearly dependent on any finite subset of its elements. 

It follows from Theorem IV, §4, that the statements that a [continuous space
function; space sequence; point in a G-space] has a reduced upper Fourier
sequence, has a base, has a proper minimal base, or is a.p. are all equivalent.




THEOREM I. If s and t are variable real [numbers; integers], a necessary and
sufficient condition that the scalar $\gamma$ be a reduced upper Fourier sequence of the
[continuous space function $\Theta(t)$; space sequence $\Theta_t = \Theta(t)$] is that uniformly in s

$$\lim_{t\gamma \to 0} \Theta(s + t) = \Theta(s).$$


To prove sufficiency, let $f_1, f_2, \ldots$ be an upper Fourier sequence of $\Theta$
whose elements are rationally linearly dependent on $\gamma$: $c_1, c_2, \ldots$. By making
$|t\gamma|$ sufficiently small, an arbitrarily large number of the quantities $tc_1$,
$tc_2, \ldots$ can be brought arbitrarily close to multiples of an arbitrarily large
k!. There exist integers $p_i$ such that each $p_i f_i$ is a finite linear combination
of the $c_j$ with integer coefficients. Thus an arbitrarily large number of the
quantities $t p_i f_i$ can be brought arbitrarily close to multiples of k!, and by
choosing k large enough so that k! contains all of the corresponding $p_i$ as
factors, arbitrarily many of the $t f_i$ may be brought arbitrarily close to
integers. Hence t will be an arbitrarily good translation number. To prove
necessity we need only notice that the integer sub-multiples of the elements
of $\gamma$ when arranged in countable order form an upper Fourier sequence.

From the above theorem and Theorem II, §3, we obtain 


THEOREM II. If n is an integer, a necessary and sufficient condition that a
scalar $\gamma$ be a reduced upper Fourier sequence of the point A in a G-space having
the identity point I is that

$$\lim_{n\gamma \to 0} A^n = I.$$


6. ALMOST PERIODIC PROPERTIES INVARIANT UNDER MULTIPLICATION 


THEOREM I. If [$\Phi(t)$; $\Gamma_n$; A] is an a.p. [function; sequence; point] in a
G-space, then [$\Phi(t)\Theta$; $\Gamma_n\Theta$; $A^n\Theta$] is a uniformly continuous function of the
point $\Theta$ uniformly with respect to [t; n; n].


For any finite set of values of t or n, the theorem is obvious. Since $\Phi(t)$
is uniformly continuous, for an arbitrary e>0 we can divide any finite
interval up into a sufficiently large number of equal intervals so that on
any such interval $\Phi(t)$ varies by less than e/3. After choosing a point $t_i$
from each interval, we can bring all the points $\Phi(t_i)\Theta'$ within a distance of
e/3 from the corresponding points $\Phi(t_i)\Theta''$ by bringing $\Theta'$ sufficiently close
to $\Theta''$. Thus we can bring $\Phi(t)\Theta'$ within e of $\Phi(t)\Theta''$ for all t on the interval by bringing
$\Theta'$ sufficiently close to $\Theta''$. Thus the theorem is true for functions, sequences
or points on any finite interval. That it is also true for the infinite interval
can be seen by choosing a length L corresponding to e>0 so great that on
any interval of this length there is always an (e/3)-translation number or



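index or iteration exponent. Then [$\Phi(t)$; $\Gamma_n$; $A^n$] for any value of [t; n; n]
can be replaced by [$\Phi(t_0)$; $\Gamma_{n_0}$; $A^{n_0}$] with an error not greater than e/3,
where [$t_0$; $n_0$; $n_0$] lies between 0 and L. But by bringing $\Theta'$ and $\Theta''$ sufficiently
close, [$\Phi(t_0)\Theta'$; $\Gamma_{n_0}\Theta'$; $A^{n_0}\Theta'$] can be brought within a distance of e/3 from
[$\Phi(t_0)\Theta''$; $\Gamma_{n_0}\Theta''$; $A^{n_0}\Theta''$] for all [$t_0$; $n_0$; $n_0$] between zero and L at once.
Thus [$\Phi(t)\Theta'$; $\Gamma_n\Theta'$; $A^n\Theta'$] would be at a distance not greater than e for
all [t; n; n] at once from [$\Phi(t)\Theta''$; $\Gamma_n\Theta''$; $A^n\Theta''$].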


THEOREM II. The product of any two [- -; - -; permutable] a.p.
[functions; sequences; points] in a G-space is a.p.


For let s and t be real [numbers; integers; integers] and let $\Theta_1(t)$ and
$\Theta_2(t)$ be the [a.p. functions; a.p. sequences; tth powers of the a.p. points]
having the reduced upper Fourier sequences $\gamma_1$ and $\gamma_2$. Let $\gamma^*$ be a sequence
whose elements comprise all the elements of $\gamma_1$ and $\gamma_2$. Then both $t\gamma_1$ and $t\gamma_2$
can be brought arbitrarily close to zero by bringing $t\gamma^*$ sufficiently close to
zero. Thus, uniformly in s,

$$\lim_{t\gamma^* \to 0} \Theta_1(s + t) = \Theta_1(s) \quad\text{and}\quad \lim_{t\gamma^* \to 0} \Theta_2(s + t) = \Theta_2(s).$$

Then since $\Theta_1(t)\Gamma$ is uniformly continuous in $\Gamma$ uniformly with respect to t,
it follows that uniformly in s

$$\lim_{t\gamma^* \to 0} \|\Theta_1(s + t)\Theta_2(s + t); \Theta_1(s + t)\Theta_2(s)\| = 0$$

and hence that uniformly in s,

$$\lim_{t\gamma^* \to 0} \|\Theta_1(s + t)\Theta_2(s + t); \Theta_1(s)\Theta_2(s)\| = 0.$$

Now we note that in case $\Theta_1(t)$ and $\Theta_2(t)$ represent the tth powers of
permutable points, $\Theta_1(t)\Theta_2(t)$ represents the tth power of the product of the
points; and hence in all three cases our theorem is proved.


COROLLARY. A sequence whose elements comprise all the elements of $\gamma_1$
and $\gamma_2$ which are reduced upper Fourier sequences of two a.p. [functions;
sequences; permutable points] in a G-space is a reduced upper Fourier sequence
for the product of the [functions; sequences; points].


DEFINITION. A sequence of points $A_1, A_2, \ldots$ will be said to converge
exponentially uniformly if $A_1^n, A_2^n, \ldots$ converges uniformly with respect
to n for all integers n.


THEOREM III. If a sequence of a.p. [space functions; space sequences;
points] converges [- -; - -; exponentially] uniformly, its limit is a.p.




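For, using the natural extension of the notation of the last theorem, it
follows from the fact that $\lim_{j\to\infty} \Theta_j(t)$ is uniform in t that

$$\lim_{t\gamma^* \to 0}\ \lim_{j\to\infty} \Theta_j(s + t) = \lim_{j\to\infty}\ \lim_{t\gamma^* \to 0} \Theta_j(s + t) = \lim_{j\to\infty} \Theta_j(s),$$

where the limits with respect to t are uniform in s.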


COROLLARY. If $\gamma_1, \gamma_2, \ldots$ are reduced upper Fourier sequences for an
[- -; - -; exponentially] uniformly convergent sequence of a.p. [space
functions; space sequences; points] and $\gamma^*$ is a sequence which has each of the $\gamma_i$ as
subsequences, then $\gamma^*$ is a reduced upper Fourier sequence for the limit
[function; sequence; point].

THEOREM IV. The [- -; - -; exponentially] uniform limit of an infinite
product of [- -; - -; permutable] a.p. [space functions; space sequences;
points] in a G-space is a.p.


7. PSEUDO-ARGUMENTS, INDICES, AND EXPONENTS 


DEFINITION. If [$\Theta(t)$; $\Gamma_n$; A] is an a.p. [space function; space sequence;
point] having the base $\gamma$, and if $\alpha$ is any scalar of the same length as $\gamma$, then
the symbol [$\Theta(\alpha)_\gamma$; $(\Gamma_\alpha)_\gamma$; $A_\gamma^\alpha$] will denote

$$\Bigl[\lim_{t\gamma \to \alpha\gamma} \Theta(t);\ \lim_{n\gamma \to \alpha\gamma} \Gamma_n;\ \lim_{n\gamma \to \alpha\gamma} A^n\Bigr]$$

and will be called the pseudo-[value of the function; element of the sequence;
power of the point] corresponding to the pseudo-[argument; index; exponent]
$\alpha$ with respect to the base $\gamma$. The base $\gamma$ will be omitted from the notation when
the context makes clear what base is to be used. If the points of the space are
transformations, the same nomenclature will be used except that for a family
of transformations the terms pseudo-value or argument will be replaced by
pseudo-member or parameter. A pseudo-[element; power] of an a.p. [space
sequence; point] with respect to a base $\gamma$ will be called proper if $\gamma$ is a proper
base and no non-integer element of the pseudo-[index; exponent]
corresponds to a rational element of $\gamma$.


THEOREM I. All [- -; proper; proper] pseudo-[values; elements; powers]
of an a.p. [space function; space sequence; point] exist.

For if t is a variable real [number; integer; integer] and $\alpha$ is any scalar
[argument; index; exponent] which satisfies the hypothesis, then $\alpha\gamma$ can be
approached arbitrarily closely by the scalar $t\gamma$; and if $\Theta(t)$ is the [function;
sequence; tth power of the point], then uniformly in t

$$\lim_{s\gamma,\ s'\gamma \to \alpha\gamma} \Theta(s - s' + t) = \Theta(t),$$

so that

$$\lim_{s\gamma,\ s'\gamma \to \alpha\gamma} \|\Theta(s); \Theta(s')\| = 0$$

and the theorem follows.


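THEOREM II. If t is a real [number; integer; integer] and $\iota$ is the identity
scalar of the same length as the base $\gamma$ of the a.p. [space function $\Theta(t)$; space
sequence $\Gamma_n$; point A], then [$\Theta(t\iota)_\gamma$; $(\Gamma_{t\iota})_\gamma$; $A_\gamma^{t\iota}$] is the same point as
[$\Theta(t)$; $\Gamma_t$; $A^t$].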

THEOREM III. The [- -; proper; proper] pseudo-[value $\Theta(\alpha)_\gamma$; element
$(\Gamma_\alpha)_\gamma$; power $A_\gamma^\alpha$] of the a.p. [space function $\Theta(t)$; space sequence $\Gamma_n$; point A]
is a uniformly continuous function of the scalar $\alpha\gamma$ for all [- -; admissible;
admissible] values of $\alpha$.


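For in the case of the function having the base $\gamma$, to a given e>0
corresponds d>0 so small that for all t and t' satisfying $|(t - t')\gamma| < d$,

(1) $$\|\Theta(t); \Theta(t')\| < e.$$

Now let $\alpha$ and $\alpha'$ be any scalars satisfying $|(\alpha - \alpha')\gamma| < d$. Then when $t\gamma$
and $t'\gamma$ are sufficiently close to $\alpha\gamma$ and $\alpha'\gamma$ respectively, the equation (1)
is satisfied; and hence

$$\|\Theta(\alpha); \Theta(\alpha')\| = \lim_{t\gamma \to \alpha\gamma,\ t'\gamma \to \alpha'\gamma} \|\Theta(t); \Theta(t')\| \le e.$$

Similar arguments show that the theorem holds for sequences and points also.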


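THEOREM IV. If $A^\alpha$ and $A^\beta$ are proper pseudo-powers of the a.p. point A
taken with respect to the same base, then

$$A^\alpha A^\beta = A^{\alpha+\beta}.$$

For

$$\lim_{n\gamma \to (\alpha+\beta)\gamma} A^n = \lim_{m\gamma \to \alpha\gamma,\ n\gamma \to \beta\gamma} A^{m+n} = \lim_{m\gamma \to \alpha\gamma} A^m \lim_{n\gamma \to \beta\gamma} A^n.$$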

COROLLARY 1. Any two proper pseudo-powers of an a.p. point are permutable.

COROLLARY 2. If $A^\alpha$ is a proper pseudo-power of the a.p. point A, then
$(A^\alpha)^n = A^{n\alpha}$.

THEOREM V. If $\gamma$ is a base for the a.p. [space function $\Theta(t)$; space sequence
$\Gamma_n$; point A] and [$\alpha$ and $\beta$ are any scalars; $\Gamma_\alpha$ and $\Gamma_\beta$ are proper
pseudo-elements; $A^\alpha$ is a proper pseudo-power], then [$\Theta(t\alpha+\beta)$; $\Gamma_{n\alpha+\beta}$; $A^\alpha$] is a.p. and
has the reduced upper Fourier sequence $\alpha\gamma$.




For in the case of the function, as $t\alpha\gamma \to 0$, $[(s+t)\alpha+\beta]\gamma \to [s\alpha+\beta]\gamma$
uniformly in s; and since $\Theta(\alpha)$ is uniformly continuous in $\alpha\gamma$ it follows that
uniformly in s

$$\lim_{t\alpha\gamma \to 0} \Theta[(s + t)\alpha + \beta] = \Theta[s\alpha + \beta].$$

The proof is essentially the same for the sequences and points.


8. MONO-BASAL FUNCTIONS, SEQUENCES AND POINTS 


NOTATION. Let $\omega_p$ denote the scalar which has its pth element equal to
unity and all other elements zero. Its length will be indicated by the context.

DEFINITION. A [space function; space sequence; point] is called mono-basal
if it is a.p. and has a base consisting of but one element.


THEOREM I. Any pure periodic [space function; space sequence; point]
is mono-basal, having unity as a base.


THEOREM II. If the a.p. [space function $\Theta(t)$; space sequence $\Gamma_n$; point A]
has the [- -; proper; proper] base $\gamma$: $c_1, c_2, \ldots$ and [$\beta$ is any scalar; $(\Gamma_\beta)_\gamma$
is any proper pseudo-element; - -], then [$\Theta(t\omega_p+\beta)_\gamma$; $(\Gamma_{n\omega_p+\beta})_\gamma$; $A_\gamma^{\omega_p}$] is
mono-basal and has $c_p$ as a minimal base.

For it has $\omega_p\gamma$ as a reduced upper Fourier sequence.

DEFINITION. A space [function; multiple sequence] of any countable set
of [variables; indices] will be called mono-basal in any one of its [variables
t; indices n] if it is a mono-basal [function of t; sequence in n] for each set
of constant values of the other [variables; indices]. The diagonal [function;
sequence] of the [function $\Theta(t_1, t_2, \ldots)$; multiple sequence $\Gamma_{n_1, n_2, \ldots}$] is
the [function $\Theta(t, t, \ldots)$; sequence $\Gamma_{n, n, \ldots}$].


THEOREM III. An a.p. space [function; sequence] can be expressed as the
diagonal [function; sequence] of a mono-basal space [function; multiple
sequence] of a countable set of variables.


For let $\Theta(t)$ be the a.p. [function; sequence] of the real [number; integer]
t. Then $\Theta(t) = \Theta(t\omega_1 + t\omega_2 + \cdots)$ is the diagonal of $\Theta(t_1\omega_1 + t_2\omega_2
+ \cdots)$, which is mono-basal in each of its arguments.

THEOREM IV. If A is an a.p. point, then with respect to any proper base the
infinite product $A^{\omega_1} A^{\omega_2} \cdots$ or $\prod_{p=1}^{\infty} A^{\omega_p}$ converges absolutely and
exponentially uniformly to the value A.

For $n\omega_1 + n\omega_2 + \cdots$ converges uniformly in n to $n\iota$, and hence
$A^{n\omega_1 + n\omega_2 + \cdots}$ converges uniformly in n to $A^n$. Moreover changing the order
of the $\omega_1, \omega_2, \ldots$ would not destroy the convergence.




Because of its importance in this work, the following theorem will be 
stated in terms of both points and transformations. 


THEOREM V. A necessary and sufficient condition that a [transformation;
point] be a.p. is that it be the exponentially uniformly convergent infinite product
of permutable mono-basal [transformations; points].


9. EXAMPLES 
I will bring this paper to a close by giving two examples of a.p.
transformations.
I. Let $A_k$ be a two-way sequence of non-negative real numbers such that
$\sum_{k=-\infty}^{\infty} A_k$ converges. Let the space € have as its points the complex functions
f(x) of a real variable x having the Fourier series

$$f(x) = \sum_{k=-\infty}^{\infty} a_k e^{ikx},$$


where the $a_k$ are numbers satisfying $|a_k| < A_k$. Let the distance between any
two transformations $F(x) = \Theta_1[f(x)]$ and $F(x) = \Theta_2[f(x)]$ be

$$\max |\Theta_1[f(x)] - \Theta_2[f(x)]|.$$


Let c, be any two-way sequence of complex numbers each having the 
absolute value 1, and let 


1 
g(y,#) = — 
2r k=—co 


be a function of the real variables y and ¢. Then the transformation 


(1) = F(x) = lim J fle — 


is a.p. 

For the set of transformations of the form (1) which is obtained by using 
all possible sets of values for the coefficients c, of the function g(y, #) is a 
G-space. Moreover it can be shown that 


®[f(x)] = 


and hence that 


log co log ¢1 log c_1 log ¢2 
, 


is a reduced upper Fourier sequence of ®. 




II. Let $c_1, c_2, \ldots$ be an infinite sequence of distinct complex numbers
each having the absolute value 1, and let $\sum A_k$ be a convergent series of
non-negative real numbers. Let € have as its points all functions f(z) of the
complex variable z of the form

(1) $$f(z) = \sum_{k=1}^{\infty} a_k e^{c_k z},$$

where the $a_k$ are any complex numbers satisfying $|a_k| < A_k$.
Let the distance between two transformations

$$F(z) = \Theta_1[f(z)], \qquad F(z) = \Theta_2[f(z)]$$

be the least upper bound for all functions f(z) in € of $\sum_{k=1}^{\infty} |a_k' - a_k''|$, where
$a_k'$ and $a_k''$ are the coefficients of the series of the form (1) for $\Theta_1[f(z)]$ and
$\Theta_2[f(z)]$. Then the transformation

$$\Theta[f(z)] = F(z) = \frac{d}{dz} f(z)$$


is a.p. 

For it can be shown that each function of € has a unique representation
of the form (1). Let us associate with each sequence of numbers $c_1', c_2', \ldots$,
each of whose absolute values is unity, the transformation which takes each
function of the form (1) into the corresponding function

$$F(z) = \sum_{k=1}^{\infty} c_k' a_k e^{c_k z}.$$

This set of transformations is a G-space, and it contains the transformation
$\Theta$ which corresponds to $c_1, c_2, \ldots$ and has the reduced upper Fourier
sequence

$$\frac{\log c_1}{2\pi i},\ \frac{\log c_2}{2\pi i},\ \ldots.$$

CORNELL UNIVERSITY,
ITHACA, N. Y.


THE BERTINI TRANSFORMATION IN SPACE* 


BY 
F. R. SHARPE AND L. A. DYE


1. Introduction. Examples are known of involutorial transformations J 
having an invariant pencil of planes through a line /, such that in each plane 
of the pencil there is a transformation of the Geiser or the de Jonquiéres type. 
We have shown in this paper that involutorial transformations exist which 
have in each plane through / a Bertini transformation with 6 of the funda- 
mental points lying on a Cs, p=3, the other two being on /, and either fixed 
or variable. There is another type in which 6 of the 8 fundamental points lie 
on a Cys, p=4, and in every plane through / there is a degenerate Bertini 
transformation. A third type is discussed in which there is a net of invariant 
quartic surfaces through a Cu, p=14. The method of obtaining this last 
transformation leads also to an involutorial transformation with a net of in- 
variant surfaces of order +1 through a Csn_3 of genus 12n—19. This type 
has on each plane through / a Geiser transformation having the 7 fundamental 
points on C;5,_3. 

2. The involutorial Bertini transformation $J_B$ on a cubic surface $F_3$.
The conics tangent to a cubic surface $F_3$ at two fixed points $O_1$, $O_2$ meet $F_3$ in
two residual points P, P' which are conjugate points of an involutorial
Bertini transformation $J_B$ on $F_3$. The web of quadrics tangent to $F_3$ at $O_1$, $O_2$
meets $F_3$ in a web of sextic curves of genus 2 which is invariant under $J_B$, as is
also the pencil of plane sections through the line $l$: $O_1 + O_2$. If the space (y) of
$F_3$ is transformed into a space (z) by means of the web of cubic surfaces
through a fixed $C_6$, p=3, on $F_3$, then $F_3$ is transformed into a plane meeting
the fundamental sextic of the transformation in 6 points $Q_3, \ldots, Q_8$. If
$Q_1$, $Q_2$ are the transforms of $O_1$, $O_2$, then $J_B$ becomes a plane transformation
of order 17, a line going into a $C_{17}$: $8Q^6$. The image of each six-fold point $Q_i$
is a $C_6$: $Q_i^3 + 7Q_j^2$ $(i \ne j)$. The line $Q_1Q_2$ is the transform of a cubic curve on
$F_3$ through $O_1$, $O_2$.

Analytically, if ye=0, y: =0 are the planes tangent to F; at O:=(1, 0,0, 0), 
0.=(0, 1, 0, 0), the equation of F; may be written 


(1) Ayr + By2 +C = yila*yiye + + yo(b?y192 + B) + + 6 = O, 


where a, 8, y, 6 are binary forms in v3, y,. The transformation J is defined by 


(2) Ay! = Byz, = = Ya, = 
* Presented to the Society, December 27, 1933; received by the editors November 20, 1933. 


The images of $O_1$, $O_2$ are the sextics in which the quadrics A=0, B=0 meet
$F_3$. Since the second polars of $O_1$, $O_2$ differ from A, B by terms containing
$y_2$, $y_1$ respectively, the quadrics have second-order contact with $F_3$ at $O_1$, $O_2$
respectively and meet $F_3$ in sextics having triple points at $O_1$, $O_2$ respectively.

3. The transformation $J_B$ for a pencil of cubic surfaces. Since a point P
determines an $F_3$ of the pencil (parameter $\lambda$) on which P' can be found by
the method of §2, we can define involutorial space transformations by either
taking $O_1$, $O_2$ as fixed points lying on each $F_3$ of the pencil, or by taking one
or both of them variable (the coordinates being functions of $\lambda$), on a rational
curve lying on each $F_3$.

4. Case I, $O_1$, $O_2$ are fixed. For this case we take a pencil of cubic surfaces
$F_3$ having in common a $C_9$, p=10, through $O_1$=(1, 0, 0, 0), $O_2$=(0, 1, 0, 0),
and write the equation of the $F_3$ in the form


Fy = Fi — 
= xe + + cxf + dxyxe + exe? + fxr + gre t+ h =0, 


(3) 


where a=a’—)a’’, etc., andc’, c’’, etc., are binary forms in x3, x4. A change of 
coordinate system given by 


(4) = bay +e, Yo = +6, Ya = % 
will express F; in the form (1). The transformation (2) in terms of x; is 

xi = (axeB + cB — eA)Ba, 
(5) = (bx,A — cB + eA)Ab, 

xg = x3ABab, x{ = x,ABab. 
The surface of invariant points K =y,A —y2B=0 contains ab as a factor as 
does the transformation (5). The forms A, B are of degree 4 in \ and of de- 
gree 2 in x;, so that the x/ are of degree 5 in x; and of degree 8 in X. If X is 
replaced by Fi /F’ we have an J in which the image of O, is A =0, the image 


of O, is B=0, and the image of C, can be obtained by applying the transfor- 
mation to an Seg. The table of characteristics of the J29 is 


O:1~ Fu: Or +02 +Cs, 
O:~ Fu: Or +0: 
Co~ Fee: +02 +Co, 

Kis: +0: +C>. 




Every plane through the line /:0,+-O, cuts from the Fs. a composite curve 
of order 56, the 7 components of which are the images of the 7 residual inter- 
sections of Cy with the plane. If O; (¢=3, - - - ,9) is any one of these 7 points 
and 60; (j=3, - - - , 9) are the others, then 


O; ~ Cs: + O2 + O; + 60; (i,j 


In each of these planes there is a transformation of the Bertini type of order 
29. If it is transformed by a quadratic transformation having O,, O2, O; for 
fundamental points it becomes the usual Bertini transformation of order 17 
with 8 six-fold points at O,, O2, 60;. 

Since $C_9$ is of genus 10 there are 11 trisecants of the $C_9$ which pass through
$O_1$ or $O_2$. Any one of these 22 lines meets an $S_{29}$ in $14 + 2\cdot 8 = 30$ points and
therefore lies on the $S_{29}$. These lines are the fundamental lines of the second
species in the $J_{29}$. The surface $R_{42}$ of trisecants of $C_9$ contains $C_9$ as an 11-fold
curve. The line $l$ meets $R_{42}$ in 20 points not on $C_9$ from which trisecants of $C_9$
may be drawn. In any one of the 20 planes determined by one of these
trisecants and $l$, the 6 residual intersections of $C_9$ lie on a conic. Each of these 20
conics meets an $S_{29}$ in $2\cdot 14 + 4\cdot 8 = 60$ points and therefore lies doubly on the
$S_{29}$. They are the fundamental conics of the second species in the $J_{29}$. The
tangent planes to the pencil of cubic surfaces at $O_1$ form a pencil of planes
through the tangent line to $C_9$ at $O_1$. The plane of the pencil which passes
through $O_2$ cuts from the corresponding $F_3$ a cubic curve with a double
point at $O_1$ and through $O_2$ and the 6 residual intersections of $C_9$ with the
plane. There is another such cubic curve with the roles of $O_1$, $O_2$ interchanged.
Each of these cubics meets an $S_{29}$ in $28 + 14 + 6\cdot 8 = 90$ points and hence lies
triply on the $S_{29}$. They are the fundamental cubics of the second species in
the $J_{29}$. There exist then 22 lines, 20 conics, and 2 cubics which are parasitic
curves in the involutorial transformation $J_{29}$.
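
The intersection counts in this paragraph can be checked mechanically. The sketch below assumes that the surface in question has order 29 and uses only the general fact that a curve of degree d not lying on a surface of order n meets it in at most d·n points; the dictionary layout and the function name are illustrative, and the multiplicities 14 and 8 are simply those appearing in the sums of the text.

```python
SURFACE_ORDER = 29  # assumed order of the surface against which the counts are made

# degree of the curve, and the number of intersections counted in the text
counts = {
    "line": (1, 14 + 2 * 8),
    "conic": (2, 2 * 14 + 4 * 8),
    "cubic": (3, 28 + 14 + 6 * 8),
}

def must_lie_on_surface(degree, intersections, order=SURFACE_ORDER):
    """True when the count exceeds degree * order, the most a curve
    of that degree can have without lying on the surface."""
    return intersections > degree * order

results = {
    name: (n, must_lie_on_surface(d, n)) for name, (d, n) in counts.items()
}

# Totals of parasitic curves as stated: 11 trisecants through each of two points.
parasitic_lines = 2 * 11
```

Each of the three counts (30, 60, 90) exceeds the corresponding bound (29, 58, 87), which is why the curves are forced to lie on the surface.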

If the Cy is composed of a space cubic C; through O,, O2 and a Cs, p=3, 
[Cs, Ce]=8, the surface Fs, breaks up into an F,:0;4+0.4+C?+C?, the 
image of C;, and an F4s:O74 +O74 +C}? +C?*, the image of Cs. If we trans- 
form the space (x) into a space (z) by means of the cubic transformation 
T3,3:Ce the pencil of F;’s becomes a pencil of planes through the line /’ which 
is the transform of Cs. In each plane through /’ there is a Bertini transforma- 
tion of order 17. The transform of the surface Fs of trisecants of C, is C¢ , and 
to O,, O2 correspond the points (1, 02. The characteristics of the J29 in the (z) 
space are 


0: +0. +2 +8, 
+0 




Co~ Fea: Or +02 +1" +60", 
Si~ S29: +Q2 +1 


712 


+Ce, 


In the (x) space in place of the surface Re:C;!! of trisecants of Cy we have 
the surface Rs:C? of trisecants of Cs, the surface Rj :C;4+C, of bisecants of 
C3 which meet C,, and the surface Rog:C3’+C,’ of bisecants of Cs which meet 
C;. The 22 parasitic lines in the (x) space are (a) the 4 bisecants of C; through 
O; which meet C,, (b) the 4 bisecants of C; through O. which meet Ce, (c) the 
7 bisecants of C, through Oy, (d) the 7 bisecants of Cs through Oz. The lines 
of types (a), (b) correspond to parasitic conics in the (z) space through Q, or 
Q2 and meeting Cf in 5 points. The lines of types (c), (d) correspond to para- 
sitic lines which are bisecants of Cj from Q; or Q2. The 8 trisecants of Cé 
meeting /’ are parasitic and correspond in the (x) space to the 8 points 
[Cs, Ce]. 

The line / meets the surface Rs:C¢ in 8 points, hence 8 trisecants of C. 
meet /. Each of the planes determined by / and one of these trisecants meets 
the F; containing the trisecant in a residual conic which is parasitic. The 8 
conics go into parasitic cubics in the (z) space which have double points on 
Cé and pass through and 5 points on The surface Reg:C3’ +C,’ is 
met by / in 12 points, hence 12 bisecants of Cs meet C; and /. In each of the 
12 planes determined by these lines and / there is a parasitic conic which cor- 
responds to a parasitic conic in the (z) space through Q:, Q2, and 4 points of 
Cg. The two parasitic cubics with double points at O, or O2 and through O, 
or O, and 6 points of C, correspond to similar cubics in the (z) space. The 9 
in the (z) space has 22 lines, 20 conics, and 10 cubics which are fundamental 
curves of the second species. 

5. Case II, O; is variable on a space cubic curve C;. We take a pencil of 
cubic surfaces (parameter \) through a space cubic curve C; containing the 
points O,=(1, d*, A, A), O2=(0, 1, 0, 0), and having the equation 

(px) x1 4% 
(6) Fs=| (qx) x xs | =Fs — M3’ = (px)H, + (qx)H, + (rx)H, = 0, 
(rx) x3 
where (px) = pitit puts, pi=pi —dpi', etc., and H,, H,, H, are 
quadrics through C;. A point P(x) determines an F; of the pencil and a defi- 


nite point O, so that by the construction of §2 we can determine a point 
P’(x’). A change of variables is made by 




Vi = pox, — 
= p(x2 — + — g(xs — + 
AX, 


AX, 


where p=AP—OQ, gq=dA0—R, pa, etc. The planes =0, 
y2=0 are the tangent planes at O2, O; respectively, and the planes y;=0, 
ys=0 are a pair of planes through the line /:0,+0O:2. The pencil of cubic sur- 
faces is now of the form (1) and the transformation (2) gives Jz. 

The equations of the surfaces A =0, B=0, K =0 may be written in terms 
of x; and ¥; as follows: 


mp|p(PH, + QH, + RH,) + (ys — dys) {dg(px) — M(gx) + p(rz)}] 
— yalm(ys — — Me + pre) + pys(Op2 — Paz)] = 0, 
mp|m(psH » + + — yal qe(px) — po(qx)}] 
— yilm(ys — — + pre) + pys(Qp2 — Pq2)] = 0, 

= yi[p(PH, + QH, + RH,) + (ys — dye) {dg(px) — M(gx) + p(rz)}] 
— y2[m(p2H» + + — ya{qa(px) — p2(rx)}] = 0, 


where m=)p2—g2, M=dp+q. The surfaces A=0, B=0 are of the second 
degree in x; and of degrees 16, 10 in \ respectively. The surface K =0 is of the 
third degree in x; and of degree 9 in A. 

We transform the space (x) into a space (zs) as was done in the latter part 
of §4. A surface of the web in the (x) space goes into a surface of the web in 
the (z) space such that in any plane through /’ there is a Bertini transforma- 
tion of order 17. By this J the image of I’ is a C;:01+Q2+60? which with J’ 
makes up a C,:8Q? that is the plane section of a sextic surface having Qi, Qe, 
Cé as double elements. This sextic surface is the transform of a quadric sur- 
face through C; and tangent to F; at O:, O2. This quadric which is the image 
of C; and is of the sixth degree in \ has the equation 


+ pq2H + (Apge — + Qq2)H, = 0. 


Any plane through /’ is invariant under Jz, hence a pencil of surfaces of 
the web in the (z) space is made up of the pencil of planes through /’ together 
with the image surfaces of 1’, Q:, Qe. Since the image of Q; by the Jy in a 
plane through /’ is a C.:0,2 +Q2?+60? and since A =0 is of order 16 in X, 
the image of Q; by the J, in the (z) space is a surface of order 6+16=22 on 
which /’ is a 16-fold line with 3 sheets of the surface having contact along 1’. 
The point Q; is 16+3=19-fold; Q2 is a 16+2=18-fold point; Cj is a double 
curve. In the same way we obtain the surfaces corresponding to Q, and /’, and 


A 
B 
K 




the invariant surface K. The table of characteristics of the Js: is 


og, 
Qi~ Fu +07 +07 +09, 
+ 


where the coefficient of ¢ indicates the number of fixed tangent planes at a 
point of J’. 

In determining the number of parasitic lines, conics, and cubics the methods
in the previous section have to be changed when the variable point Q_1 is
involved. There are 7 bisecants of C_6' through Q_2 and 8 trisecants of C_6'
meeting l' which are parasitic. In any plane λ through l' the 15 bisecants of
C_6' meet l' in 15 points μ; through any point μ on l' the 7 bisecants of C_6'
determine 7 planes λ through l'. The number of coincidences in the (λ, μ)
correspondence is 15+7=22, and hence in 22 positions of Q_1 a bisecant of
C_6' can be drawn from Q_1 in the plane through l' associated with Q_1. These
bisecants are parasitic lines.

There are 4 conics through Q_2 and 5 points of C_6' in planes through l'
which are parasitic. In any plane λ through l' the 6 conics through 5 points
of C_6' meet l' in 12 points μ; through any point μ on l' the 4 conics through 5
points of C_6' lie in 4 planes λ through l'. There are 12+4=16 coincidences in
this (λ, μ) correspondence and therefore 16 positions of Q_1 such that Q_1 and
5 points of C_6' lie on a conic in the plane through l' associated with Q_1. In
any plane λ through l' the 15 conics through Q_2 and 4 points of C_6' meet l'
in 15 points μ; through any point μ on l' the 12 conics through Q_2 and 4 points
of C_6' lie in 12 planes λ through l'. The 15+12=27 coincidences of this (λ, μ)
correspondence determine 27 positions of Q_1 such that Q_1, Q_2, and 4 points of
C_6' lie on a conic in the plane through l' associated with Q_1. These 47 conics
are all parasitic.

There are 2 values of \ given by m=0 for which the tangent plane to F; 
at O; contains Oy, and there are 5 values of \ given by p=0 for which the tan- 
gent plane to F; at O; contains O». In each of these 7 planes there is a cubic 
with a double point at O, or O,; and passing through O, or O2 and 6 points of 
Cs. These 7 cubics correspond to 7 similar cubics in the (z) space. In any 
plane \ through /’ there are 6 cubics with a double point on Cj and through 
Q. and the 5 remaining points of Cj. These 6 cubics meet /’ in 12 points yp. 




Through any point μ on l' there are 8 cubics with a double point on C_6' and
through Q_2 and the other 5 points of C_6'. These cubics lie in 8 planes λ through
l' so that there are 12+8=20 coincidences in the (λ, μ) correspondence and
20 positions of Q_1 such that Q_1, Q_2, and the 6 points of C_6' lie on a cubic with
a double point at one of these latter points. These cubics lie in planes
through l' associated with the positions of Q_1. There are then 37 lines, 47
conics, and 27 cubics which are fundamental curves of the second species in the
transformation.
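
The counts in Case II can be tallied mechanically. The following is only a sketch: it assumes that a plane through l' meets the sextic in 6 points, that every correspondence used is of the simple (a, b) type with a + b coincidences, and it takes the numbers 7, 8, 4, 12 and 8 directly from the text. The function name `coincidences` is illustrative and not part of the paper.

```python
from math import comb


def coincidences(a: int, b: int) -> int:
    """Coincidences of an (a, b) correspondence on a line (Chasles' principle)."""
    return a + b


PTS = 6  # points in which a plane through l' meets the sextic (from the text)

# Lines: 7 bisecants through Q_2, 8 trisecants meeting l',
# plus the coincidences of the (15, 7) correspondence of bisecants.
lines = 7 + 8 + coincidences(comb(PTS, 2), 7)

# Conics: 4 through Q_2 and 5 points; 6 conics through 5 points meet l' in
# 2 points each (12 in all); C(6, 4) = 15 conics through Q_2 and 4 points.
conics = 4 + coincidences(2 * comb(PTS, 5), 4) + coincidences(comb(PTS, 4), 12)

# Cubics: 7 in the special tangent planes; 6 nodal cubics through Q_2 meet l'
# in 2 further points each (12 in all), against 8 planes through a point.
cubics = 7 + coincidences(2 * PTS, 8)

print(lines, conics, cubics)
```

The three totals agree with the 37 lines, 47 conics, and 27 cubics stated above.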

6. Case III, O;, O2 are both variable on a space cubic C;. To illustrate 
the case where the points O,, O; are variable on a rational curve which is part 
of the basis curve of a pencil of cubic surfaces we again utilize a rational 
space cubic. Other rational curves might be considered and other arrange- 
ments of the points O,, O. might be used, but the transformations obtained 
resemble the J;; in Case II and the J;; derived in the following case. 

The points O,=(1, u?, uw), O2=(1, —u), where \ lie on the 
C; which with C, makes up the basis of the pencil of cubic surfaces F; given 
by (6). A change of coordinate system is made by 


Yi = + + — G(x2 + + 


Yo = p(x2 — 2wxs + — — + 


Ys = x2 — 
Ya = — 
where p=uP—Q, etc., and the dashed 
letters indicate a change of sign in uw. The surface F; is now in the form (1) 
and the involutorial transformation (2) is determined. We have the following 
expressions for A, B, K written in terms of x; and y; for the sake of concise- 
ness: 
A =4y°M —ug(px) +M (qx) — p(r2)} 
+yalys{ 
+uya| + Rp)} ], 
[M(PH,+0H,+R4,) —(yst+uys) { ug(px)+M (qx) —p(rx)} 
+ Rp) } 
+uys{ —M(uPq—QM+Rp)+ + Rp)} |, 
{ —ng(px)+M (qx) — p(rz)} ] 
— {ug(px) (qx) —p(rz)}], 
where M =yup+gq. 




The surfaces A=0, B=0 are of order 2 in x_i and of order 13 in μ² after the
removal of a factor μ². The invariant surface K=0 is of order 9 in μ² and of
order 3 in x_i. The image of C_3, which is the quadric which contains C_3 and is
tangent to F_3 at O_1, O_2, is of order 6 in μ² and has the equation


(pM — pM)H, + + + u(QM + gM)H, = 0. 


The table of characteristics of the Jz in the (z) space may be obtained as 
in Case II and with the same results except that the images of the points 
Q:, Qe combine and the joint image is 


Q2) ~ Fs: 1 +Cs + 


In any plane λ through l' the 15 bisecants of C_6' meet l' in 15 points μ;
through any point μ on l' the 7 bisecants of C_6' determine 7 planes λ through
l'. In the correspondence (λ, μ) there are 15+7+7=29 coincidences since
λ=μ², and hence in 29 positions of the pair of points Q_1, Q_2 a bisecant of C_6'
can be drawn from one of them in the plane through l' associated with the
pair. There are 8 trisecants of C_6' which meet l'. These 37 lines are parasitic
in the transformation.

In any plane λ through l' the 6 conics through 5 points of C_6' meet l' in
12 points μ; through any point μ on l' the 4 conics through 5 points of C_6'
lie in 4 planes λ through l'. The number of coincidences in the (λ, μ)
correspondence is 12+4+4=20 and hence in 20 positions of the pair of points
Q_1, Q_2, one of the pair and 5 points of C_6' lie on a conic in the plane through
l' associated with the pair. In any plane λ there is a pencil of conics through
each of the 15 sets of 4 of the 6 points of C_6'. Each pencil determines an
involution on l' which has one pair in common with the involution of points
μ, hence 15 pairs of points μ² are determined. Given any pair of points μ² on
l' there are 12 planes λ through l' in which there are conics through the pair
μ² and 4 points of C_6'. In the correspondence (λ, μ²) the 15+12=27
coincidences fix 27 positions of the pair Q_1, Q_2 such that conics in the
associated planes pass through them and 4 of the points of C_6'.

The 7 values of λ given by M M̄ = 0 determine 7 planes tangent to F_3 at
O_1 or O_2 which pass through O_2 or O_1. From the associated F_3 each of these
planes cuts a cubic with a double point at O_1 or O_2 and passing through O_2 or
O_1 and 6 points of C_6. These 7 cubics correspond to similar cubics in the (z)
space which are parasitic in the transformation. In any plane λ through l' there
are 6 pencils of cubics through the 6 points of C_6' and with a double point at
one of them. Each pencil determines an involution of the third order on l' which
has 2 pairs in common with the involution of points μ, hence to a λ correspond
12 pairs of points μ². Given any pair of points μ² on l' there are 8 planes
through l' in which there are cubics through the pair μ² and 6 points of C_6'
and which have a double point at one of the points of C_6'. The correspondence
(λ, μ²) has 12+8=20 coincidences which determine 20 positions of the pair
Q_1, Q_2 such that in the associated plane there will be a cubic through Q_1, Q_2
and the 6 points of C_6' and having a double point at one of the points of C_6'.
Hence as in Case II we have 37 lines, 47 conics, and 27 cubics which are
fundamental curves of the second species in the transformation.

7. A Bertini transformation on a cubic variety in S_4. In a space of four
dimensions we take a cubic variety V_3 with a double point at O_5 = (0, 0, 0, 0, 1)
and through the points O_1 = (1, 0, 0, 0, 0), O_2 = (0, 1, 0, 0, 0). The equation of
the variety is

\[ V_3 \equiv x_5\,\phi_2 + \phi_3 = 0, \]

where φ_2, φ_3 are quaternary forms in x_1, x_2, x_3, x_4 with the x_1³, x_2³ terms missing
in φ_3. The conics tangent to V_3 at the points O_1, O_2 meet V_3 in two residual
points P, P’ which are conjugate points in a Bertini involution Jz on V3. 
This involution can be mapped on the 3-space x; =0, and a Bertini involution 
Tz in 3-space is thus determined. The hyperplane x; =0 meets V; in the cubic 
surface ¢;=0, and meets the tangent hypercone to V; at O; in the quadric 
¢2=0. The surfaces ¢2=0, ¢;=0 meet in a sextic curve C, of genus 4. Any 
plane 7 through the line 0,0, meets C; in 6 points R which lie on a conic. The 
hyperplanes through O,, O, are invariant under Jz, and the planes = are in- 
variant under J. Since the 6 points R lie on a conic in each plane 7, the Ber- 
tini involution in such a plane is degenerate and of the form J,3;:0,°+0;° 
+6R‘, with an invariant curve k;:0; +0; +6R?. The Jz in the space x; =0 
has the characteristics 


O1~ Fe: O01 +0: +Cs, 
O:~ Fe: 0; +0: 
12 12 7 

S:~Su: O. +0, 


The 6 bisecants of C, from O, and the 6 from O; are parasitic lines in Jz 
and correspond to lines on the V; through O; or O2. To determine the number 
of parasitic conics we must find the number of conics which lie on V; and pass 
through O, and Oz, since in any such conic the construction used to determine 
J will fail in the sense that to a point on the conic corresponds the whole 
conic. By a proper choice of coordinate system we can write the equation of 
any cubic variety in the form 




(7) + + + + crite +d = 0 


where a, b, c, d are ternary forms in 2s, %4, xs. The left hand member of (7) 
can be factored as follows: 


(xixe + b)(x1 + x2 + 0), 
i 
(8) a—b=0 and ac—d=0. 


Equations (8) represent two hypercones of the second and third orders re- 
spectively whose rulings are planes. The 6 planes common to the two hyper- 
cones cut conics from the cubic variety through the points O,, O2. Hence there 
are 6 fundamental conics of the second species in the J; besides the 12 lines 
of the second species. 

8. A family of space Bertini transformations. A net of planes =)i%1 
+)av2+Asx3 =0 through the point (0, 0, 0, 1) and a net of cubic surfaces 


(9) = x4(ax)? + x1x2(bx) = MF 3 + = 0, 


where (ax)? and (bx) are quaternary forms in %1, x2, %3, 4, and 


= + + Asay; , and b; = + + Addi’, 


through the lines , =x.=x,=0 and ,.=x,=x,=0 may be used to determine 
a transformation of the Bertini type. A point P(x) determines a set of \; and 
hence a plane z and a surface F;. The plane x cuts the lines i, /, in a pair of 
points Oi(As, 0, —A1, 0), O2(0, As, —As, 0). The conic through P(x) and tan- 
gent to F; at O, and O, will meet F; in a residual point P’(x’) which is the 
conjugate of P(x) in an involutorial transformation J. 
If we make the linear transformation 

Yi = AsBox1 + 

y2 = + Aim, 

ys = Aix + + Asxs, 


Ya = 


By, = bids — 
Bz = beds — 
A,= auds — + 
Ae — + 


then equation (9) is in the form of (1), and the transformation (2) may be 
used to obtain 


A 
where 




= B,B(By: — AsAy), 
= BA(Ay: — AiBy,), 
— B,B(Byz — ArAys) — BeA(Ayi — 


= Biyiys + + — 

+ + — 2arsds — — bids) |, 
= + + — 

+ + — — — bids) 


The \,; are now replaced by 
Ai = o1 = — 3’, 
he = = — 
As = = — 
The quartic surfaces ¢; =0 have in common the lines h, /2, and a residual curve 
Cy of order 11 and genus 14. The surfaces A =0, B=0 which are the images 
of the lines ),, /, are of order 8 in ¢; and 2 in x; after a factor ¢? is removed. 
The factor ¢;°B,B, can be removed from the transformation, and the invari- 
ant surface K=y,A —y2B=0 has the factor ¢# B,B:. The characteristics of 
the transformation are 
~ Fe : + + 
le ~ + + Cn 
Cu Faoa? + I + 
+h +Cu, 
Keith +h +Cn. 

The x parasitic lines of J are trisecants of C_11 which meet either l_1 or l_2.
Since C_11 meets l_1 in 4 points there are 7 residual intersections R_i in a plane
through l_1. In any such plane a line R_iR_j meets l_1 in a point P, and through
each of R_i and R_j pass 5 other bisecants of C_11 meeting l_1 in 10 points Q. If
h' is the number of bisecants of C_11 through any point of l_1, the points P, Q
are in (10h', 10h') correspondence. The 20h' coincidences are determined by
the x trisecants of C_11 meeting l_1, the r' tangents of C_11 meeting l_1, and the 4
tangents to C_11 where it meets l_1. Hence

\[ 20h' = 6x + 5r' + 30\cdot 4. \]

Since the C_11 is of class r=48 and has h=31 apparent double points, then
h' = h − 4·3/2 = 25, and r' = r − 2·4 = 40. These values make x=30, but among
the 30 trisecants the line l_2, which is a quadrisecant, is counted 4 times. Hence
there are 26 trisecants of C_11 meeting l_1 and 26 more which meet l_2. These 52
lines are the parasitic lines of the transformation J.
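
The arithmetic of this count can be checked with a few lines of code. This is a sketch under stated assumptions: the helper name `trisecant_count` is hypothetical, and the consistency check uses the standard formulas r = 2(n + p − 1) and h = (n − 1)(n − 2)/2 − p for a nonsingular space curve of order n and genus p, which the paper does not spell out here.

```python
from fractions import Fraction


def trisecant_count(h_red: int, r_red: int, pts_on_line: int) -> Fraction:
    """Solve 20*h' = 6*x + 5*r' + 30*k for x, with k points of the curve on the line."""
    return Fraction(20 * h_red - 5 * r_red - 30 * pts_on_line, 6)


# Data for the residual curve: order 11, genus 14, meeting l_1 in 4 points.
order, genus, k = 11, 14, 4
r = 2 * (order + genus - 1)                    # class (rank) of the curve
h = (order - 1) * (order - 2) // 2 - genus     # apparent double points

h_red = h - (4 * 3) // 2   # h' = h - 4*3/2, as in the text
r_red = r - 2 * 4          # r' = r - 2*4, as in the text

x = trisecant_count(h_red, r_red, k)
per_line = x - 4           # the quadrisecant l_2 was counted 4 times
total = 2 * per_line       # the same count holds for l_2

print(r, h, h_red, r_red, x, per_line, total)
```

With these inputs the sketch gives r = 48, h = 31, x = 30, and 52 parasitic lines in all.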

Let y be the number of parasitic conics and z be the number of parasitic 
cubics of 7. The complete intersection of two surfaces of the web of So is 
made up of 


692 = 69 + 22? + 22? + 11-16? + 52 + 8y + 272, 
and the complete intersection of an Sg and the Kz; is made up of 
69-27 = 27 + 9-22 + 9-22 + 6-16-11 + 52 + 4y + 92. 


The solution of these equations is y=45, z=18, whence we can conclude that
the fundamental curves of the second species in J consist of 52 lines, 45
conics, and 18 cubics.

9. A family of space Geiser transformations. If F,=0 is a surface of or- 
der m with an (n—2)-fold line /=x;=x,=0, the equations 


(10) r= + A2Xe + X3X3 = 0, 


(11) = ax; + + + + +c = + + = 0, 


where a=),a’+)2a’’+);a’"’, etc., and a’, a’, a’’’, etc., are binary forms in xs, 
x4, define a net of plane curves C, of order m with an (m—2)-fold point 
Q=(Az, —du, 0, 0). A line through Q and a point P(x) on C,, meets it in a re- 
sidual point P’(x’), thus defining an involutorial transformation J having the 
invariant net of surfaces 


kidi + kobe + 
= — + — + — = 0. 


The pencil of planes p=x,—yx;=0 through / are invariant under J and in 
any such plane F,, takes the form 


(12) ax, + bxs + + + 2fxexs + 2gxix3 = O, 


where the coefficients are polynomials in 4. This net of conics enables us to 
map J on a double space S(Ai:A2:As, «). A plane 


(13) M1X1 + + + Mex, = O 


is mapped on S by eliminating x; between (10) and (13) and using x,=yzs. 
The values of x; thus obtained are substituted in (12) giving 


a(mod3 — Msd2)? + — mdz)? + — modi)? 
+ 2h(mo2dr3 — Misd2)(Msr1 — mids) + 2f(Misdri — — 
+ — — m2d1) = 0, where = ms + um, 
which must be identical with 
(mix, + + m3x3 + m4X4) (my xi + + + = 0. 
From this identity we have 
= — dads + Cds, 
= ad; — + Dds, 
xg 


If we replace u by x,/x3 and X; by ¢; we have the transformation J in the form 


= bbs — + cdo, 
2 2 
= Chi — + ads, 
xi = xs(ab2 — + bea), 
x = — 2hdid2 + bés), 


where 21, x2 are factors of the first two equations respectively. The surfaces 
¢;=0 are of order n+1 and have / as an (n—2)-fold line. The residual basis 
curve of the net of ¢; is a Cs,_s of order 5m —3 and genus 12m —19 through the 
point (0,0, 0, 1). The image of / in J is the surface L =a? —2hdid2+bo? =0, 
which is of order 3 in ¢; and of order n—2 in x3, xs. The image in S of the in- 
variant surface K has the equation 


Ar Ae As 


which corresponds to K? in the space (x). Hence X is of order 2 in ¢; and of 
order m —1 in x3, x4. The table of characteristics of J is 


12n—18 
Si Sango: + Coss 
5 


a h g 1 
= 
g f ¢ 




In any plane p through / there is an ordinary Geiser transformation, 
therefore the C;,-3 meets such a plane in the 7 fundamental points R; of the 
Geiser transformation and in 5”—10 points on /. The section of Csn_s by the 
plane x;=0 is the point (0, 0, 0, 1) and 6 points lying on the conic x;=0, 
F,{"’ =0. Hence on this plane the Geiser transformation degenerates and the 
conic is parasitic for J. 

The x parasitic lines of J are trisecants of C_{5n−3} meeting l. Since C_{5n−3}
meets any plane p in 7 points not on l the method of §8 may be used in
determining the number of trisecants of C_{5n−3} which meet l. The number
x = 15n − 15 is obtained from the equation

\[ 20h' = 6x + 5r' + 30(5n - 10), \]

where r' = 24n − 26, and h' = 18n − 26. Therefore the fundamental curves of
the second species for J consist of 15n − 15 lines and one conic.
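
For the reader's convenience, the value of x follows from the stated equation by direct substitution of the given h' and r'; nothing beyond the quantities in the text is assumed:

```latex
\begin{align*}
20(18n - 26) &= 6x + 5(24n - 26) + 30(5n - 10), \\
360n - 520   &= 6x + 270n - 430, \\
x            &= 15n - 15 .
\end{align*}
```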


CORNELL UNIVERSITY,
ITHACA, N. Y.




ON THE PROBLEM OF n BODIES* 


BY 
J. J. L. HINRICHSEN 


Introduction. For the problem of three bodies, Sundman† established, together
with other results, that if the angular momentum of the three bodies is
not zero about every axis through the center of gravity of the system, the
greatest of the three mutual distances will always exceed a specifiable
constant depending upon the initial configuration of the bodies, and hence that
triple collision is impossible. The problem was then considered from a
different point of view by Birkhoff‡ in his Chicago Colloquium lectures of 1920.
He considered the case for which the angular momentum of the three bodies
about every axis through the center of gravity of the system is not zero and
for which the constant K appearing in the energy integral T = U − K is
(1) equal to or less than zero, and (2) greater than zero. Here T denotes the
kinetic energy and −U denotes the potential energy of the system. He showed
for the first case that at least two if not all three of the mutual distances
increase indefinitely as the time increases and decreases. For the second case,
he showed that if the motion of the three bodies is such that at some instant all
three bodies approach sufficiently near to one another, then two of the mutual
distances become infinite with the time while the third mutual distance
remains less than a definite constant depending only upon the energy constant
and the total mass of the system. After stating and proving various other
results, he concluded by stating without formal proof that the results
described above may be extended to the case of n bodies attracting one
another according to the Newtonian law of force as well as to the case of n bodies
attracting one another according to a more general law of force. The present
paper has as its object the investigation of the conditions under which these
extensions apply.

The equations of motion and other fundamental relationships. We shall
denote the n bodies (assumed to be particles) by P_i (i = 1, 2, ..., n), and
suppose them to have positive finite masses m_i and real coordinates (x_i, y_i, z_i).
The distance from P_i to P_j will be denoted by r_ij. We shall suppose that the
bodies attract one another in such a way that there exists a potential function

* Presented to the Society, December 30, 1929; received by the editors July 19, 1932, and, in 
revised form, July 24, 1933. 

† Sundman, Mémoire sur le problème des trois corps, Acta Mathematica, vol. 36 (1913), p. 105.

‡ Birkhoff, Dynamical Systems, 1927, p. 260. This book is volume IX of the American
Mathematical Society Colloquium Publications.




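\[ U = \sum_{i<j} \frac{m_i m_j}{r_{ij}^{\,d}}, \qquad i \neq j,\quad 0 < d < 2. \]

If d=1, this function reduces to that for the Newtonian law of attraction.
Inasmuch as the probability of collision among particles moving according
to this law is zero in the general case, we shall assume that none of the n
bodies ever collide. Then all of the r_ij will always be positive.
If t denotes the time, the equations of motion will be

\[ m_i \frac{d^2 x_i}{dt^2} = \frac{\partial U}{\partial x_i}, \qquad
   m_i \frac{d^2 y_i}{dt^2} = \frac{\partial U}{\partial y_i}, \qquad
   m_i \frac{d^2 z_i}{dt^2} = \frac{\partial U}{\partial z_i}
   \qquad (i = 1, 2, \dots, n). \]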

The ordinary existence theorems for a system of differential equations may
be applied to yield the result that for assigned values of the coordinates and
velocity components for t = t̄, where t_0 < t̄ < t_1, there exists a unique set of
analytic functions x_i(t), y_i(t), z_i(t), x_i'(t), y_i'(t), z_i'(t), defined and
satisfying the system of equations for t_0 < t < t_1, and taking on the assigned
values for t = t̄. Furthermore, since we assume that the distances r_ij are always
positive, the interval of definition may be extended to the interval −∞ < t < ∞.
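The equations of motion admit the following ten integrals:

\[
\begin{aligned}
&\sum m_i\,(x_i'^2 + y_i'^2 + z_i'^2) = 2(U - K),\\
&\sum m_i x_i = \sum m_i y_i = \sum m_i z_i = 0, \qquad
 \sum m_i x_i' = \sum m_i y_i' = \sum m_i z_i' = 0,\\
&\sum m_i\,(y_i z_i' - z_i y_i') = c_1,\\
&\sum m_i\,(z_i x_i' - x_i z_i') = c_2,\\
&\sum m_i\,(x_i y_i' - y_i x_i') = c_3,
\end{aligned}
\]

where the summations for i are to be taken from 1 to n. Here K, c_1, c_2, c_3 are
constants of integration and the primes denote derivatives with respect to t.
The coordinate system has been so chosen that the center of gravity of the
system is fixed at the origin.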
If we define

\[ R^2 = \sum_{i=1}^{n} m_i\,(x_i^2 + y_i^2 + z_i^2)
       = \frac{1}{2M} \sum_{i,j=1}^{n} m_i m_j r_{ij}^2, \]

where M represents the total mass of the system, it is not difficult to obtain
the analogue of Lagrange's Identity*:

\[ (R^2)'' = 2(2 - d)U - 4K. \]

We shall suppose 0 < d < 2 in order that the coefficient of U may be positive.
* Lagrange, Essai sur le probléme des trois corps, Oeuvres, vol. 6, p. 240. 
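
The step from the equations of motion to this identity may be sketched as follows. The sketch assumes that U is a sum of terms m_i m_j / r_ij^d, hence homogeneous of degree −d in the coordinates, and that the center of gravity is at the origin so that R² is the sum of m_i (x_i² + y_i² + z_i²); it uses Euler's theorem on homogeneous functions together with the energy integral.

```latex
\begin{align*}
(R^2)'' &= 2\sum_i m_i\,(x_i'^2 + y_i'^2 + z_i'^2)
          + 2\sum_i \Bigl( x_i \frac{\partial U}{\partial x_i}
                         + y_i \frac{\partial U}{\partial y_i}
                         + z_i \frac{\partial U}{\partial z_i} \Bigr) \\
        &= 4(U - K) - 2d\,U \\
        &= 2(2 - d)\,U - 4K .
\end{align*}
```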




We shall now proceed to derive the analogue of Sundman's Identity* for
the problem of n bodies. Let us choose the coordinate axes in such a way that
the products of inertia of the n bodies vanish, and, if the moments of inertia
about the x, y, z axes are A, B, C respectively, that A ≥ B ≥ C. We propose to
find a minimum for the kinetic energy of the system

\[ T = \tfrac{1}{2} \sum m_i\,(x_i'^2 + y_i'^2 + z_i'^2), \]

when the 3n space coordinates are fixed and the 3n velocity components are
allowed to vary except for being required to satisfy the integrals of angular
momentum and

\[ RR' = \sum m_i\,(x_i x_i' + y_i y_i' + z_i z_i') = c_4. \]

By the Lagrange method of multipliers† the minimum value of T under
these conditions is found to be

\[ \frac{1}{2}\Bigl( \frac{c_1^2}{A} + \frac{c_2^2}{B} + \frac{c_3^2}{C} + \frac{c_4^2}{R^2} \Bigr)
   \;\ge\; \frac{1}{2}\Bigl( \frac{f^2}{R^2} + R'^2 \Bigr), \]

where f² = c_1² + c_2² + c_3². On applying the energy integral, we may express this
result by writing

\[ R'^2 + P = 2(U - K), \qquad \text{where } P \ge f^2/R^2. \]


Let us now eliminate U between the above two fundamental identities.
If we define

\[ F = 2RR'' + dR'^2 + 2dK - (2 - d)f^2/R^2, \]

the relationship obtained will show that F ≥ 0. Let us define

\[ H = R^d R'^2 + 2K R^d + f^2 R^{d-2} \]

and differentiate with respect to t. In terms of F, we obtain H' = F R^{d−1} R',
from which we have the following result: If R increases, H cannot decrease,
and if R decreases, H cannot increase. We furthermore note that if f > 0, R cannot
approach zero, since then, by its definition, H would become infinite.
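
The relation between H and F is a purely formal consequence of the two definitions, so it can be spot-checked numerically for any smooth function R(t). The sketch below assumes the forms F = 2RR'' + dR'² + 2dK − (2 − d)f²/R² and H = R^d R'² + 2K R^d + f² R^{d−2}; the test function R(t) = 2 + sin t and the constants are arbitrary illustrative choices, not a motion of the n bodies, so F need not be non-negative here.

```python
import math

D, K, F2 = 1.5, 0.5, 2.0   # exponent d, energy constant K, and f^2 (illustrative)


def R(t):   # arbitrary smooth positive test function
    return 2.0 + math.sin(t)


def R1(t):  # first derivative R'
    return math.cos(t)


def R2(t):  # second derivative R''
    return -math.sin(t)


def F(t):
    r = R(t)
    return 2 * r * R2(t) + D * R1(t) ** 2 + 2 * D * K - (2 - D) * F2 / r ** 2


def H(t):
    r = R(t)
    return r ** D * R1(t) ** 2 + 2 * K * r ** D + F2 * r ** (D - 2)


def check(t, eps=1e-6):
    """Return (central-difference H'(t), F * R^(d-1) * R') for comparison."""
    numeric = (H(t + eps) - H(t - eps)) / (2 * eps)
    exact = F(t) * R(t) ** (D - 1) * R1(t)
    return numeric, exact


for t in (0.3, 1.1, 2.5):
    print(t, *check(t))
```

The two printed columns should agree to within the accuracy of the finite difference.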

By means of the six integrals of linear momentum, the system of
equations of motion may be reduced to a system of order 6n−6. We shall carry
out this reduction in the following manner. For any instant, consider first
all possible ways of dividing the n bodies into two groups G_1, G_2, and choose
that one for which at the given instant the distance from the center of gravity
of one group to that of the complementary group is greatest. There may be

* Sundman, loc. cit., p. 148. 
† See for example Goursat, Cours d'Analyse Mathématique, 1923, vol. 1, p. 119.




more than one such method of subdivision giving this maximum distance,
in which case we shall divide the n bodies into two groups in any one of the
several possible ways. Let the coordinates of the center of gravity of one
group, G_2, with respect to the center of gravity of the complementary group,
G_1, as origin be (ξ_1, η_1, ζ_1) and define ρ_1² = ξ_1² + η_1² + ζ_1².

If either G_1 or G_2 contains more than one body, consider the various
possible ways of subdividing G_1 and G_2 into subgroups, and choose a method
of subdividing one group to give the greatest possible distance from the
center of gravity of one subgroup to that of the complementary subgroup.
Let the coordinates of the center of gravity of one subgroup with respect
to the center of gravity of the complementary subgroup as origin be (ξ_2, η_2, ζ_2)
and define ρ_2² = ξ_2² + η_2² + ζ_2². This process of subdivision may be repeated
until each of the final groups contains only one body. When this stage has
been reached, n−1 sets of coordinates (ξ_j, η_j, ζ_j) will have been introduced
together with n−1 distances defined by ρ_j² = ξ_j² + η_j² + ζ_j².
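
The subdivision procedure just described can be sketched in code for a single instant. Everything here is illustrative: the function names are hypothetical, ties are broken arbitrarily (as the text permits), and each split is weighted by the reduced mass m'm''/(m' + m'') of its two subgroups, which is an assumption of this sketch. The final lines check the standard property of such nested center-of-gravity coordinates that the sum of m_i times the squared distance from the common center of gravity equals the sum of the reduced masses times ρ_j².

```python
from itertools import combinations
import math


def center(bodies):
    """Total mass and center of gravity of a list of (mass, (x, y, z))."""
    m = sum(b[0] for b in bodies)
    c = tuple(sum(b[0] * b[1][k] for b in bodies) / m for k in range(3))
    return m, c


def best_split(group):
    """Bipartition of `group` with the greatest distance between sub-centers."""
    best = None
    idx = range(len(group))
    for size in range(1, len(group) // 2 + 1):
        for chosen in combinations(idx, size):
            g1 = [group[i] for i in chosen]
            g2 = [group[i] for i in idx if i not in chosen]
            (m1, c1), (m2, c2) = center(g1), center(g2)
            rho = math.dist(c1, c2)
            if best is None or rho > best[0]:
                best = (rho, m1 * m2 / (m1 + m2), g1, g2)
    return best


def relative_coordinates(bodies):
    """Return the n-1 pairs (rho_j, reduced mass) produced by repeated splitting."""
    groups, out = [list(bodies)], []
    while any(len(g) > 1 for g in groups):
        # Among all groups still holding more than one body, take the split
        # that gives the greatest distance between sub-centers.
        cands = [(best_split(g), i) for i, g in enumerate(groups) if len(g) > 1]
        (rho, mu, g1, g2), i = max(cands, key=lambda c: c[0][0])
        out.append((rho, mu))
        groups[i:i + 1] = [g1, g2]
    return out


bodies = [(1.0, (0.0, 0.0, 0.0)), (2.0, (1.0, 0.0, 0.0)),
          (3.0, (0.0, 2.0, 0.0)), (1.5, (5.0, 1.0, -1.0))]
pairs = relative_coordinates(bodies)
M, X = center(bodies)
R_squared = sum(m * sum((p[k] - X[k]) ** 2 for k in range(3)) for m, p in bodies)
print(len(pairs), R_squared, sum(mu * rho ** 2 for rho, mu in pairs))
```

The exhaustive search over bipartitions grows exponentially with the number of bodies; it is meant only to make the definition concrete.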

The equations of transformation from (x_i, y_i, z_i), i = 1, 2, ..., n, to
(ξ_j, η_j, ζ_j), j = 1, 2, ..., n−1, will depend upon the distribution of the
bodies with respect to one another and hence will in general depend upon t.
If the position of each of the bodies at a given instant is known, there will
always exist at least one way of separating the n bodies into groups in the
manner described above, and then the equations of transformation together
with their inverse formulas may be written down. If the system of n bodies is
divided into groups in a proper manner, the same equations will apply
throughout some interval of time containing the given instant. As t increases
or decreases, the intervals of time throughout which a particular grouping
obtains may become smaller and smaller and approach zero as a limit. Since
we exclude the possibility of collision, the velocities of the bodies are bounded
and in a sufficiently small interval of time the position of the bodies can
change by only a small amount. To make it possible to continue the
transformation beyond a limit point of grouping intervals, we shall modify the
above method of dividing the bodies into groups in a sufficiently restricted
neighborhood of the limit point so as to preserve a constant grouping there.
Then by setting up a finite number of sets of equations of transformation, we
may for any given finite interval of time express the equations of motion
together with the energy integral and the integrals of angular momentum in
terms of the new variables (ξ_j, η_j, ζ_j).

Throughout any interval of time t' < t < t'' for which ρ_j(t) represents the
distance between the centers of gravity of the same two fixed groups of bodies,
ρ_j(t) will be analytic. For an instant at which the grouping changes, some
distance ρ_j must change to the distance between two new centers of gravity.
In this case all of the following ρ_j will in general also change to distances
between new centers of gravity. Except for intervals of time containing limit
points of grouping intervals, the ρ_j with the smallest subscript which changes
will be continuous but will in general have a discontinuous first derivative.
For such intervals of time as contain a limit point of grouping intervals the
ρ_j with the smallest subscript which changes will itself have a break whose
magnitude may be made arbitrarily small by taking the interval about the
limit point small enough. The ρ_j with larger subscripts will in general be
discontinuous in either case.

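Suppose for t' < t < t'', the group of bodies ΣP_{i'} has its center of gravity
at the (ξ_j, η_j, ζ_j)-origin and the group ΣP_{i''} has its center of gravity at the
point whose coordinates are (ξ_j, η_j, ζ_j). If we define

\[ \mu_j = \frac{\bigl(\sum m_{i'}\bigr)\bigl(\sum m_{i''}\bigr)}{\bigl(\sum m_{i'}\bigr) + \bigl(\sum m_{i''}\bigr)}, \]

the reduced system of equations of order 6n−6 will assume the simple form

\[ \mu_j \frac{d^2 \xi_j}{dt^2} = \frac{\partial U}{\partial \xi_j}, \qquad
   \mu_j \frac{d^2 \eta_j}{dt^2} = \frac{\partial U}{\partial \eta_j}, \qquad
   \mu_j \frac{d^2 \zeta_j}{dt^2} = \frac{\partial U}{\partial \zeta_j}
   \qquad (j = 1, 2, \dots, n-1). \]

If we denote the derivatives of ξ_j, η_j, ζ_j with respect to t by ξ_j', η_j', ζ_j', the
energy integral becomes

\[ \sum \mu_j\,(\xi_j'^2 + \eta_j'^2 + \zeta_j'^2) = 2(U - K), \]

while the integrals of angular momentum become

\[ \sum \mu_j\,(\eta_j \zeta_j' - \zeta_j \eta_j') = c_1, \qquad
   \sum \mu_j\,(\zeta_j \xi_j' - \xi_j \zeta_j') = c_2, \qquad
   \sum \mu_j\,(\xi_j \eta_j' - \eta_j \xi_j') = c_3. \]

Finally, if the values of x_i, y_i, z_i in terms of ξ_j, η_j, ζ_j are substituted in the
expression for R², we obtain

\[ R^2 = \sum_{j=1}^{n-1} \mu_j \rho_j^2. \]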

Some properties of the motions. We shall now proceed to consider those
properties of the motions of the n bodies acting under the above law of force
which correspond to the properties considered by Sundman* and Birkhoff†
for the problem of three bodies under the Newtonian law of force. With the
analogues of the fundamental identities of Lagrange and Sundman together
with the (ξ_j, η_j, ζ_j), j = 1, 2, ..., n−1, coordinates available, the proofs of


* Sundman, loc. cit., p. 105.
† Birkhoff, loc. cit., p. 275.




these theorems will be found similar to those for the classical problem. For 
this reason we shall merely state certain results and outline the proofs of 
others. 

Directly from the analogue of Sundman's Identity we have the following:
For the case K ≤ 0, 0 < d < 2, at least n−1 of the mutual distances r_ij increase
indefinitely as the time increases or decreases. We shall now restrict ourselves
to the case of K > 0.

If K > 0, the least of the mutual distances r_ij cannot exceed [M²/(2K)]^{1/d}.
This result follows immediately when the definition of U is applied to the
energy integral.
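
The omitted step may be sketched as follows, assuming that U is the sum over pairs i < j of m_i m_j / r_ij^d and writing r_min for the least mutual distance. Since the kinetic energy T = U − K is non-negative,

```latex
\[
K \;\le\; U \;=\; \sum_{i<j} \frac{m_i m_j}{r_{ij}^{\,d}}
  \;\le\; \frac{1}{r_{\min}^{\,d}} \sum_{i<j} m_i m_j
  \;<\; \frac{M^2}{2\, r_{\min}^{\,d}}
  \quad\Longrightarrow\quad
  r_{\min} \;<\; \Bigl[ \frac{M^2}{2K} \Bigr]^{1/d},
\]
```

where the last inequality before the implication uses the fact that the sum of the products m_i m_j over pairs equals (M² − Σ m_i²)/2.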

For the case K >0, the largest r;; will necessarily exceed k times the smallest 
provided 


2K 


2 


where m denotes the least of the masses m;. Here we must apply the analogue 
of Sundman’s Identity together with inequalities obtained from R. 

For the case K > 0, 0 < d < 2, any part of the curve R = R(t) (t, R rectangular
coordinates) for which R < f[(2−d)/(2dK)]^{1/2} consists of a finite arc concave
upwards and with a single minimum. If R = R_0 gives this minimum, the curve
rises on either side until R satisfies the inequality

\[ (R^d - R_0^d)\big/\bigl[1 - (R_0/R)^{2-d}\bigr] \;\ge\; f^2\big/\bigl(2K R_0^{2-d}\bigr), \]


with a corresponding slope R’ at least as great as is demanded by the inequality 


1 1 


at every intermediate stage. This result follows from a combination of the 
analogue of Lagrange’s and the analogue of Sundman’s Identity. 

Since we are considering motions for which there are no collisions, f must
be positive before this theorem may be applied. The bodies are all near
together at some instant t = t_0, the amount of separation being measured by R.
The bodies separate in such a way that R increases, and very rapidly as long
as R is not too small or large, until R has become very large. Since the least
of the mutual distances is not greater than [M²/(2K)]^{1/d} for all values of the
time, at least two bodies must remain relatively near together throughout the
entire motion.

We shall now turn to consider the function ρ_1(t). We shall prove the
following theorem:

In the case K > 0, throughout any interval of time ρ_1'' > −3dM^{d+2}/(2mρ_1)^{d+1}.
If for any instant ρ_1' > [3M^{d+2}/(2^d m^{d+1} ρ_1^d)]^{1/2}, then ρ_1 will continue to
increase indefinitely with t.

Let us first consider t in an interval t' < t < t'' sufficiently restricted so that
the n bodies preserve one and the same grouping throughout. If one group
consists of the k bodies P_i (i = 1, 2, ..., k; k = 1, 2, ..., n−1), while the
complementary group consists of the n−k bodies P_j, j = k+1, ..., n, then
we can show that there exists a positive lower bound for the distances r_ij in
terms of ρ_1, namely r_ij ≥ 2mρ_1/M for i = 1, 2, ..., k; j = k+1, ..., n.

The distance p; may be written MW /y?, where 


w= > and W?= ( + ( mise) + ( > mas). 


Upon differentiating twice with respect to ¢ and dropping three non-negative 
terms from the second member, we obtain 
pi’ = [M/(wW)][( Domix:)( Domixi’) + ( 
+ ( domz:)( 
where the summations are to be taken from i=1 to i=k. If furthermore we 


use the equations of motion to eliminate the second-order derivatives and 
simplify by applying inequalities of the type 


k 
— 4; mx; = W 


we obtain the desired inequality concerning ρ_1''. By integrating both sides
of this inequality, we find that if for any t in t' < t < t'' the inequality
involving ρ_1' is satisfied, ρ_1 will continue to increase indefinitely if the
grouping of the n bodies does not change. For any instant that the grouping does
change, either ρ_1' will be continuous or it will be increased, and hence if this
inequality is satisfied for any instant, ρ_1 will continue to increase indefinitely
with t.

We proceed to combine these results in order to show that a motion
having its minimum R sufficiently small is one for which R and ρ_1 increase
indefinitely as t increases or decreases. According to what has been proved,
for R* and R*' arbitrarily large and for any fixed d, 0 < d < 2, a positive R_0
can be chosen so small that all motions for which the minimum R is not more
than R_0 correspond to an R which increases from the minimum to R* and
has for R = R* a derivative R' which is at least as great as R*'.

The function ρ_1(t) is defined throughout any finite interval of time and
will satisfy the inequality

\[ 2R\big/\bigl[(n - 1)M^{1/2}\bigr] \;<\; \rho_1 \;<\; (2M)^{1/2} R / m, \]

from which it is evident that if R increases indefinitely so also must ρ_1,
and conversely if ρ_1 increases indefinitely so also must R.
Let us consider a fixed value of $R_0$ satisfying the inequality


(a) 0 < Ry < (2 — 


Then $R$ must increase until


$$R^d/[1 - (R_0/R)^{2-d}] = f^2/(2KR_0^{2-d}).$$


Given any value $R^*$, we can choose $R_0$ so small that $R$ becomes greater than
$R^*$. We shall suppose therefore that $R_0$ has been chosen so small that in
addition to satisfying (a) the motion is such that $R$ increases until
$R \ge 2^{1/(2-d)}R_0$. In this case $R$ increases from $R_0$ until $2R^d=f^2/(2KR_0^{2-d})$ or
until $R=R^*=f^{2/d}/(2^2KR_0^{2-d})^{1/d}$. The above inequality will be satisfied if
Ry is chosen so small that f?/¢/(2?K Ry?-4)/4> Ro, or if 


(b) Ro S 2-4) K) 1/2, 
Now let us define $R^{**} = mR^*/[2^{3/2}(n-1)M]$. If we choose $R_0$ so small that
(c) Ro < — 1)¢M4K}*, 


we shall have $R_0<R^{**}/2$. If we define $R^{***}=R^*/2$, it is obvious that
$R^{**} < R^{***}$. If $t_0$ denotes the first value of $t$ for which $R(t)=R_0$, and $t^*$
denotes the first value of $t$ greater than $t_0$ for which $R(t) = R^*$, there will exist
a unique pair of values $t^{**}$, $t^{***}$ in the interval $t_0<t<t^*$ such that
$R(t^{**}) = R^{**}$ and $R(t^{***}) = R^{***}$. Since the function $R(t)$ is continuous and has a
continuous derivative for $t$ in any closed interval $t^{**} \le t \le t^{***}$, we may apply
the law of the mean for derivatives, which states that there exists at least
one point $\bar{t}$ in $t^{**}<t<t^{***}$ such that $(R^{***}-R^{**})/(t^{***}-t^{**}) = \bar{R}'$ where
$\bar{R}'=R'(\bar{t})$. Since $\bar{R}<R^*$, $\bar{R}'$ must satisfy our previous inequality and
$(R^{***} - R^{**})/(t^{***} - t^{**}) > \bar{E}$ where $\bar{E}$ denotes $E$ with $R$ replaced by $\bar{R}$.

Consider now the average rate of change of $\rho_1$ throughout the interval
$t^{**}<t<t^{***}$. We find


$$[\rho_1(t^{***}) - \rho_1(t^{**})]/[t^{***} - t^{**}] > [R^{***} - R^{**}]/[(n-1)M^{1/2}(t^{***} - t^{**})].$$


There must exist a value of $t$, say $t_1$, satisfying $t^{**}<t<t^{***}$, such that
$\rho_1'(t_1) \ge \bar{E}/[(n-1)M^{1/2}]$.

We wish to show that a motion having its minimum $R$, denoted by $R_0$,
small enough is one for which $R$ and $\rho_1$ become infinite with $t$. This result
will follow if E/[(m—1)M/?]> [(2?-¢M2+4) or on elimi- 
nating if 




J. J. L. HINRICHSEN 


(m24+1f2F2) / 1)2(¢+1) @4+8)/2K R2-4] 2. 


It is obvious by the choice of R** and R***, that [Ré¢—R¢]/R¢>—1. 
Since R <R*/2, we have 1/(R4R?-¢) >27+4K/f?. Also since R>R** we have 
1/R? > — — | 

If furthermore we suppose 


m4! (2-4) f 


/ [2(2-4)] on 1)4/ K 1/2 


1 1 
=) > 241K, 


(d) Ro < 


then 


and the desired inequality will be satisfied if 


(24 — (ad+1) / (2-4) (2-4) 


(e) Ry < 


26-4) — JY / [2(2-4)] 


We have the following result: In the case K >0, 0<d <2, if the motion is
such that the n bodies approach so closely that the minimum R denoted by Ro 
satisfies the inequalities (a), (b), (c), (d) and (e), then at least n—1 mutual 
distances become infinite with t while at least one such distance remains less 
than [M*/(2K) 

We may also state one further property of motions of the above kind. 
Any motion for which f>0, K >0, 0<d<2 and the bodies are all near together
at some instant t=t is characterized by the property that one $r_{ij}$ remains
relatively large compared to the smallest $r_{ij}$ throughout the entire motion. This result
follows from an earlier result, the definition of R, the energy integral and the 
analogue of Sundman’s Identity. 

The results of this paper may be extended to motions embracing instants 
of collision if any kind of continuation after multiple collision were possible 
in which the constants of linear and angular momentum as well as of energy 
are the same after as before collision and if also R’ may be regarded as con- 
tinuous at collision. In this case none of the analytic work would be affected 
even though for certain instants there did occur multiple collisions among 
the bodies. 


IOWA STATE COLLEGE,
AMES, IOWA




THE RULED $V_4^4$ IN $S_5$ ASSOCIATED WITH
A SCHLÄFLI HEXAD*


BY 
JOHN EIESLAND 


1. Introduction. In $S_4$ we know that all the lines which meet four generic
planes also meet a fifth associated plane. These $\infty^2$ lines generate a ruled
$V_3^3$, i.e., the variety of Segre with 10 double points. This property cannot
be generalized in space of more than four dimensions; that is: All the lines
which meet $n$ generic $(n-2)$-flats will not in general meet an additional
$(n-2)$-flat when $n>4$; generic flats will determine $n+1$

If however the $n+1$ $S_{n-2}$'s form a Schläfli set, a single $V_{n-1}^{n-1}$ is determined.
An equation of this spread has been given by C. R. Rupp.† Let the Schläfli
set be given as follows‡:


(1) x= 0, = 0 
0 


where $b_{ii}=0$, $b_{ik}=b_{ki}$. The equation of the $V_{n-1}^{n-1}$ determined by the first
$n$ flats is then


bor boe bos Don 


n 
a 


— > bax; beste be 
0 


bn—1,1%0 bn-1,2%n bn—1,3%n 20 


It may easily be verified that the same $V_{n-1}^{n-1}$ will be obtained by taking any
other set of $n$ flats from (1). All the $n+1$ flats lie on the spread and the
fundamental (or regular) singular loci are $(n-2r)$-flats of multiplicity $r=2, 3, \dots,
n/2$, when $n$ is even, and of multiplicity $r=2, 3, \dots, (n-1)/2$, when $n$ is
odd; there are $\binom{n+1}{r}$ such loci. The study of the remaining accessory singular
loci for the case $n=5$ will be the object of the present paper.


* Presented to the Society, April 4, 1931; received by the editors October 25, 1933.

† C. R. Rupp, An extension of Pascal’s theorem, these Transactions, vol. 31, p. 578.

‡ Luigi Berzolari, Sui sistemi di n+1 rette dello spazio ad n dimensioni, situate in posizione di
Schläfli, Rendiconti del Circolo Matematico di Palermo, vol. 20, pp. 229-247.




316 JOHN EIESLAND [April 


That this $V_{n-1}^{n-1}$* is not generic may be shown thus. By a projective
transformation
Xo = = (i= 1,2,---,m), 


a Schlafli set may be carried into the slightly modified form 
=0, Dim =0; (i =1,2,---,m), 
1 


where $b_{ii}=0$, $b_{i0}=1$, $b_{ik}=b_{ki}$. A general Schläfli set depends therefore
on $n(n-1)/2$ parameters. Let there now be given $n+1$ generic $S_{n-2}$'s in $S_n$:


x; x; = 0; a; = x; 
0 0 0 0 
aye; = = 0, 
0 0 
which depend on $2(n-1)(n+1)$ parameters. A projective transformation can
therefore be found which will reduce this number to


$$2(n-1)(n+1) - n(n+2) = n^2 - 2n - 2.$$


But this number is greater than $n(n-1)/2$ when $n>4$, as we wished to prove.
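
The parameter count above is simple arithmetic and can be checked mechanically. The following is only an illustrative sketch: the function names are hypothetical and not part of the paper, and the two formulas are taken directly from the text ($2(n-1)(n+1) - n(n+2)$ for the generic flats, $n(n-1)/2$ for a Schläfli set).

```python
def generic_flat_parameters(n: int) -> int:
    """Parameters of n+1 generic (n-2)-flats in S_n, less the n(n+2)
    parameters absorbed by a projective transformation (formula from the text)."""
    return 2 * (n - 1) * (n + 1) - n * (n + 2)


def schlafli_set_parameters(n: int) -> int:
    """Parameters of a general Schlafli set, n(n-1)/2 (formula from the text)."""
    return n * (n - 1) // 2


for n in range(3, 9):
    generic = generic_flat_parameters(n)
    schlafli = schlafli_set_parameters(n)
    # The closed form quoted in the text.
    assert generic == n * n - 2 * n - 2
    print(f"n={n}: generic={generic}, Schlafli={schlafli}, "
          f"generic exceeds Schlafli: {generic > schlafli}")
```

The two counts agree for $n=4$ and the generic count is strictly larger from $n=5$ onward, which matches the inequality $(n-4)(n+1)>0$ obtained by clearing denominators.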

2. The equation of $V_4^4$ in Grassmann-Plücker coordinates. The equation
(2), being a determinant of the $n$th order, is rather unwieldy for the purpose
of investigating the accessory singularities of a $V_{n-1}^{n-1}$; even in the case
$n=5$ the analytical work becomes formidable. We shall therefore use the
Grassmann-Plücker coordinates and start with the equation of the generic
$V_4^4$ which has been derived in a former paper (R, pp. 341-342): we shall then
find the conditions which must be satisfied in order that it shall be associated
with a Schläfli hexad†, and thus incidentally obtain the invariants of the
spread. We shall suppose that $V_4^4$ has no triple point, that is, no three of the
fundamental flats intersect. The equation of $V_4^4$ is‡


6 6 6 6 
vs = 1 1 1 1 - 0 


4 


6 6 
y1 > + y2 > ys + ys 
1 1 1 1 


* John Eiesland, On a class of ruled (n-1)-spreads in $S_n$, Rendiconti del Circolo Matematico di
Palermo, vol. 54, pp. 335-365. By a “generic” $V_{n-1}^{n-1}$ is meant here the $V_{n-1}^{n-1}$ whose equation is given
on p. 337. In what follows this paper will be referred to as “R.” In this paper, the $V_4^4$, here denoted
as the “generic $V_4^4$,” is the generalization for $n=5$ of Segre’s variety in $S_4$.

† By a Schläfli hexad in $S_5$ we mean here six 3-flats in Schläfli position.

‡ R, pp. 341-342.


1934] RULED VARIETIES IN FIVE DIMENSIONS 


or, if we add the elements of the first column to those of the second, 


6 6 6 6 
Days + ye + Daisy: 
3’) 1 1 1 


4 6 6 


6 6 


1 1 


The five fundamental flats are 


6 6 
=0; =0; w= ye Yaw = = 0; 
1 1 


6 6 
Dein = = 0, ase = — ands, Bir = cide — 
1 1 


If now the $V_4^4$ belongs to a Schläfli hexad, all the lines which meet these five
fundamental flats must also meet a sixth flat. In order to find such a flat
we write (3) and (3') as follows:
. yi(Li + M1) + yo(Le + M2) ys(Ls + Ms) + ys(Ls + Ms) 


-| yi1Mi + + yoM 
yi(Li + Mi) + yo(Le + M2) ya(La + Ma) + yo(Le + Meo) 


where Li=> Bizyi, R=1, 2, - - , 6. Consider the 3-flat 

(4) y+ 

If it is to be the required sixth flat it must be identical with the two flats 

ys + Ms + Ls = 0, ys — Ms — Ls = 0; + Me + Le = 0, ve — Ma Ly = 


If we set and (ik) =Aik 
+8, this means that the determinant 


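$$\begin{vmatrix}
0 & P_{12} & (13) & (15) & (14) & (16) \\
-P_{12} & 0 & (23) & (25) & (24) & (26) \\
(31) & (32) & 0 & P_{35} & (34) & (36) \\
(51) & (52) & -P_{35} & 0 & (54) & (56) \\
(41) & (42) & (43) & (45) & 0 & P_{46} \\
(61) & (62) & (63) & (65) & -P_{46} & 0
\end{vmatrix}$$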


must be of rank 2. We thus obtain the following conditions: 




$$\begin{aligned}
P_{12}P_{35} &= (13)(25) + (15)(32), \\
(36)P_{12} &= (13)(26) + (16)(32), & (56)P_{12} &= (15)(26) + (16)(52), \\
(34)P_{12} &= (13)(24) + (14)(32), & (54)P_{12} &= (15)(24) + (14)(52);
\end{aligned}$$


$$\begin{aligned}
P_{35}P_{46} &= (45)(36) + (34)(56), \\
(24)P_{35} &= (23)(45) + (25)(34), & (14)P_{35} &= (13)(45) + (15)(34), \\
(26)P_{35} &= (23)(65) + (25)(36), & (16)P_{35} &= (13)(65) + (15)(36);
\end{aligned}$$


$$\begin{aligned}
P_{12}P_{46} &= (16)(42) + (14)(26), \\
(23)P_{46} &= (24)(36) + (26)(43), & (13)P_{46} &= (14)(36) + (16)(43), \\
(25)P_{46} &= (24)(56) + (26)(45), & (15)P_{46} &= (14)(56) + (16)(45).
\end{aligned}$$


These relations are not independent; from any six of them the remaining 
ones may easily be derived. We also obtain the following important relations: 
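
The fifteen relations listed above have the shape of the vanishing of the $4 \times 4$ principal Pfaffians of a skew-symmetric matrix of order 6, which is what rank 2 requires. As a hedged illustration (not the author's computation), the sketch below builds a skew-symmetric matrix of rank at most 2 from two arbitrarily chosen vectors `u` and `v` (an assumption made only for this example) and checks that every such Pfaffian vanishes. Indices 0 to 5 are plain matrix positions; the paper orders its rows 1, 2, 3, 5, 4, 6, which only relabels them.

```python
from itertools import combinations


def rank_two_skew(u, v):
    """Skew-symmetric matrix a[i][j] = u[i]*v[j] - u[j]*v[i], of rank at most 2."""
    n = len(u)
    return [[u[i] * v[j] - u[j] * v[i] for j in range(n)] for i in range(n)]


def pfaffian4(a, i, j, k, l):
    """Pfaffian of the 4x4 principal submatrix on rows and columns i < j < k < l."""
    return a[i][j] * a[k][l] - a[i][k] * a[j][l] + a[i][l] * a[j][k]


# Illustrative vectors only; any two vectors would do.
u = [1, 2, 3, 4, 5, 6]
v = [2, -1, 0, 3, 1, -2]
a = rank_two_skew(u, v)

relations = [pfaffian4(a, *idx) for idx in combinations(range(6), 4)]
print(len(relations), all(value == 0 for value in relations))
```

Written out for the index set (0, 1, 2, 3), the vanishing Pfaffian reads `a[0][1]*a[2][3] == a[0][2]*a[1][3] + a[0][3]*a[2][1]`, which has the same pattern as the first relation of each group above.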


(5,c) 


15813 — 13815 a25813 — 13825 + 15823 — a238 15 25823 — ae3Be5 


34836 — a36834 34856 — 56834 + — 45836 — agsBas 


23813 — a13823 a25813 — a13825 + a23815 — a15823 a25815 — 15825 
(7) 1+2a= 1 + aie + + aus = 0, 1+26 = 1 + + Bss + Bas = O. 


The last two relations are obtained by using (5,b) and (6). Since there are
six independent non-homogeneous relations between the 16 parameters of a
generic $V_4^4$, the $V_4^4$ belonging to a Schläfli hexad has 10 essential parameters,
as was shown before (p. 316) by a different method.

3. The singular loci on the $V_4^4$ associated with a Schläfli hexad. We know
that the generic $V_4^4$ in $S_5$ has 5 fundamental 3-flats and that the singular loci
lie in each of these flats.* In any one flat we have four fundamental double
lines which are the intersections of the flat with the remaining four flats;
moreover, two accessory double lines which intersect these in 8 points and,
finally, a cubic curve which has the four fundamental lines as bisecants.
Through each of the 40 points pass two double lines of which one is accessory,
and one cubic; but it is to be noted that this cubic does not belong to the
fundamental flat in which the accessory line is immersed. We have thus 20
lines and 5 cubics as the complete set of double loci on a generic $V_4^4$.


* R, pp. 344-352. 




In the case at hand the $V_4^4$ has six fundamental flats in a Schläfli position.
In each flat are five fundamental double lines and two accessory double lines 
which intersect these in 10 points. The cubic is composite, consisting of three 
lines, namely the fifth double line which is added to the four of the generic 
case, and two lines which remain. We shall prove that these two lines coincide 
with the two accessory lines. It will be sufficient to prove this for any one flat, 
since what happens in one must happen in all the other five flats. 

The singular loci in the 3-flat $y_5=y_6=0$ are the complete intersection of
the two cubic surfaces


avi av: 


= = 0 


We have then, from (3), 


4 
av 4,6 4,6 

= “=P > = 0, 
1,2 1,2 

4 

av, 4,6 4,6 

= =P > —O ay; = 0. 
1,2 1,2 


(8) 


The four fundamental double lines are 


= V5 = V1 = V2 = = = Va = = O,*7 
= ¥e = Ms =Ms5=0, vs= y= L;=L;=0, 


to which must be added the fifth double line 
(10) y= M;=0, 


(9) 


which is the intersection of the flat y,=y,;=0 with the fifth fundamental 
flat y,+M;+L;=0, ys—M;—L;=0. The two accessory double lines are 


(11) Yi = KY2, Ya = 
where « and yp are roots of the two quadratic equations 
(a14816 — 16814)k? + — + — 
+ — = 0, 
(14824 — + — + — 
+ (24861 — = 0. 
These equations may also be written 
+ + Bae + 26 Bick + Bae 
+ a6; Bam + Ber + O42 Bak + Bas 


(12) 


(13) «= 




The cubic curve being composite, consisting of the fifth double line and two
additional lines, it is to be proved that these latter coincide with the two
accessory lines (11); in other words, these lines are tac-loci on the two
surfaces $\phi_1=\phi_2=0$. Take any point on the line (11), say $(\kappa\rho, \rho, \mu, 1)$. The tangent
planes to the cubics at the point are


0¢; ( 0¢; ) ( 09; ) 
=0 = 1,2). 
(2. re + + ay ye + v6 (i ) 


Noting that the accessory lines are generators of the two quadrics 


P = + + + = 0, 
Q yi(Barye + Be1ys) + yo(Bave + Bove) = 0, 


we have 


1 


0 
— (Baik +Ba2) Ms, = (Bork +Bo2) Ms, 


(Bemt+Boi) Ms, = Ls — (Bau +Be2) Ms, 
1 


a 1 
targa) L5— (Bax = (Bex Ms, 
p 


where we have set 


= + Bes)p + Basu + Bes, Ms = (cusk + + + O63, 
= (Bix + Bos)p + Basu + Bos, Ms = (ask + a25)p + + ares. 


If now the line (11) is to be a tac-locus on ¢1=¢2=0 we must have 
(= 
¢é 


— Is—p:sMs_ L 


—~iMs Ls— p2Ms Ly — paMs 


| 
4 
| 
Ls 
’ 
where 




Bam + Ber Basu + Bes _ Bak + Ba > + Bes 


2 » p= » pe 


1 


But the equations (13) show that p:=. and p;= p,, hence we get the single 


condition 
— Ls — psM3 


Ls — piMs 
which is equivalent to the condition L;M;—L;M;=0, true for all values 
of p. We thus obtain the following three relations: 

(Bisk + B23)(arsk + — (Bisk + + a3) = 0, 
(Basu + Bes) (aasu + — (Basu + + = 0, 
(Bisk + Bes)(aasu + 65) — + a25)(Basu + Bes) 

— (Brisk + + — (ask + a23)(Basu + Bos) = 0. 

The third relation is satisfied by virtue of the first two, hence x and » must 

be roots of the two quadratic equations 

(15813 — Biscris)x? + (a15823 — ae3815 + 25813 — 

+ — Bosaes = 0, 

+ agsBes — a638e5 = 0. 

But «x and p are roots of the quadratic equation (12), hence we must have 
14824 — 24861 — + — — 
cubes — — + — Boscus — 


which are true according to equations (6), as we wished to prove. 
If we set A= (ask + a5) /(as1K + @)s2, we have from (13) and (14), 


(16) + Bisk + Bes + Basu + Bes 
+ Baik + Bs2 ase + ase + Bao 

so that \ must be a root of the two equations 

— a31823)A* + (23851 — 51823 + a25831 — 
+ a25851 — = 0, 

+ — = 0, 

which have identical roots, equation (6). 


pi — ps 


(14) 


(15) 


(17) 




We have thus $6\cdot5/2=15$ fundamental double lines and 12 accessory
double lines; these lines intersect in 30 points, three lines through each point.
No point is a triple point. We shall prove the following important


THEOREM. A $V_4^4$ in $S_5$, associated with a Schläfli hexad, has two double
planes. The 12 accessory lines form a double-six, each set of six lines forming a
complete hexagon in each plane. The 15 fundamental double lines are outside of
these planes and join the 15 pairs of corresponding vertices of the hexagon. The
equations of the planes are
(18) Ky, Ys=AYs, Ye = 
where $\kappa$, $\lambda$ and $\mu$ are the roots of the three quadratic equations (12) and (17).
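
The counts quoted just before the theorem (15 fundamental double lines, 12 accessory double lines, 30 intersection points with three lines through each) can be reproduced from a purely combinatorial model. The sketch below rests on an assumed reading of the theorem, not on anything computed in the paper: a fundamental line is labeled by a pair of flats, an accessory line by a flat together with one of the two double planes, and an intersection point by a pair of flats together with a plane. All names are hypothetical.

```python
from itertools import combinations

flats = range(1, 7)      # the six fundamental 3-flats of the hexad
planes = (1, 2)          # the two double planes

# Assumed model: a fundamental double line is the intersection of two flats.
fundamental_lines = list(combinations(flats, 2))
# Assumed model: an accessory double line is the trace of a double plane on a flat.
accessory_lines = [(flat, plane) for flat in flats for plane in planes]
# Assumed model: in a given plane, the accessory lines of flats a and b meet in a
# point that lies on the fundamental line {a, b}.
points = [(pair, plane) for pair in fundamental_lines for plane in planes]


def lines_through(point):
    """The fundamental line and the two accessory lines through a point."""
    (a, b), plane = point
    return [("fundamental", (a, b)), ("accessory", (a, plane)), ("accessory", (b, plane))]


points_in_flat = {flat: [pt for pt in points if flat in pt[0]] for flat in flats}

print(len(fundamental_lines), len(accessory_lines), len(points))
```

Under this model each flat carries 10 of the points, in agreement with the statement that the two accessory lines of a flat meet its five fundamental lines in 10 points, and each plane carries 15 points, the vertices of a complete hexagon.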


The proof of the first part of this theorem is immediate. If we substitute 
the values of ¥:, y2, and y3 from (18) in the equation (3), the determinant re- 
duces to one of rank zero, since all the elements vanish, account being taken 
of (13) and (16); to prove the second part we need only show that any one 
of the planes contains the six accessory lines 


0, ye = 0, = Ya = 
(19a) ys=0, ys=0, = Kye, = 
ye = 0,7 ye = O, = KY2, Ys = 


Ky, Ys=Ays, = 0, 
(19b) = Ys=Ays, =0, = 0, 
Yi = y—-Mi—1,=0. 
The second set (19, b) may also be written 
356 466 


Yi = Ys = As Y2 = 


Bs5¥s 
Bisk + Bos + Bra 


Yi = Ys=AVs, Ye= Ue, 


P3sys 
(13) + (14)K + (24) 


Yi = Kye, Ve= Yo 


From these equations it is at once evident that on each of the planes (18) the 
six accessory lines form a complete hexagon, and that any one of the 15 
fundamental double lines joins a pair of corresponding vertices of the two 
hexagons. 




THEOREM. The 12 sides of the two hexagons are bispatial. 


To prove this we transform the origin (0, 0, 0, 0, 0, 1) to the point (px, 
p, od, , o, 1) on one of the double planes. The tangent cone, the equation of 
which being rather long, we shall not give here, is seen to be reducible for the 
following values of p and a: 


0350 46 _ Base 
+ + Bisk + B23 Biak + Baa 
P3s0 Pu 
(13)e + (23) (14)« + (24) 


These five values correspond to five sides of the hexagon. That the sixth side, 
Vi =KY2, ¥s=AYs, is also bispatial is proved by transforming the 
origin to the point (px, p, A, ur, 1, 7). The cone is reducible when 7 =0. 

From these two theorems we derive the following 


COROLLARY. Given in $S_5$ a hexad of 3-flats in a Schläfli position. There exist
two planes which intersect these flats in 12 lines forming a complete hexagon in
each plane. The $V_4^4$ associated with the hexad has the two planes for double planes
and the 15 lines joining the corresponding vertices of the hexagons are double
lines on the $V_4^4$. The 12 lines of intersection are bispatial.


p= 0, = 0, 


4. Transformation of the $V_4^4$. In order to carry the $V_4^4$ into the form given
by equation (2) for $n=5$ we set up the following transformation:


5 5 5 
(20) = V1, = 42, = %2 = Ma; = Ve, = 93, 
0 0 1 


from which it must follow that 
6 5 6 

(21) = 23, = Dan = = 
1 0 1 1 


Substituting the values of $y_i$ from (20) on the left side of these equations and
comparing the coefficients of the $x$'s on both sides we find the values of $a_i$,
$b_i$, $c_i$, $d_i$ expressed rationally in terms of the $b_{ik}$. We then calculate the
Grassmann coordinates $\alpha_{ik}$, $\beta_{ik}$. The work is rather long and tedious, but affords a
valuable check on the correctness of the method we have pursued; in fact,
the $\alpha_{ik}$, $\beta_{ik}$ thus found are seen to satisfy the fundamental relations (5,a),
(5,b), and (5,c).

The singular loci of the $V_4^4$ having been found, our work is completed. If
we had started with the equation (2) for $n=5$ we should have failed, the
analytic work being too complicated.




5. The self-dual $V_4^4$ associated with a Schläfli hexad. In a former paper*
it was proved that if on a generic $V_4^4$ in $S_5$ any two of the 10 fundamental
double lines are bispatial, they are all bispatial and the spread is self-dual.
In the case of the $V_4^4$ here considered a similar theorem holds: If any one of
the 15 fundamental double lines is bispatial, they are all bispatial and the $V_4^4$
is self-dual.

Let the double line be y:=y.2.=y3;=ys=0; transforming the origin to a 
point p on this line we set y;=y/, y¥s=y+p¥%, 1=1, 2, 3, 5, 6. The equation 
(3) may then be written, dropping the primes, 


+ + os = 0, 
where 


= + pacar) ¥1 + (ree + porae) ye] + + (Bos + ys] 
— [(Ber + pBar) ye + (Bs2 + pBe2) + + (aes + pores) ys]. 


In order that the point p shall be bispatial the discriminant of this form 
must be of rank 2. We have 


A = + pars) (Bes + pBis) — + paras) (B65 + Bas) ][(B6r + + pores) 
— (Be2 + pBa2)(ae1 + | = (0. 


Since every point p is to be bispatial we must have 
— 43845 = — = 0, — = — = 0, 
— + — = 0, — + — = 0, 


which may also be written 


hence three conditions must be satisfied, the second set being identical with 
the first, as follows from (6). If also the double line ys=y;=0, ys=ye=0 
is bispatial, we get the following two sets of conditions: 


that is, no new conditions are added, if account is taken of equations (6). It 
will not be necessary to carry out the work for the 13 remaining double lines 
as no new conditions are found. If now we set 


* R, pp. 358-360. 


41 42 61 62 45 43 O65 63 
213 15 23 25 42 O62 




the equation of the Vj in y-coordinates is 


| _ 


(23) = 
102+ 101 — 


0, 


or, when expanded, 
(r — s)Q203 + (¢ — s)Q103 + (¢ — r)Q102 = 0, 
Q1 = + + + 
Q2 = yilaarys + + + 
Os = yilasiys + + yo(aseys + as2¥s). 


(23’) 


The equation of the Vj in tangential coordinates is 


— + (s — U2 + (s — — 4(s — — = 0, 


U,= 


46035 


U + agus) + + 


46012 


U; = 
0135012 
Hence the order and class of $V_4^4$ are equal, as we wished to prove.

6. The singularities of the self-dual $V_4^4$. Since the conditions (22,a) and
the resulting conditions (22,b), (22,c), and (22,d) imply that the three
equations (15) and (17) are indeterminate, it follows that there will be an
infinite number of double planes instead of only two. These planes are the
generators of the 3-dimensional quadric $Q_1=Q_2=Q_3=0$, which may also be
written


Yo GQuvet aye + 


Setting each of these ratios equal to a variable $a$ we have the $\infty^1$ generating
planes. The six fundamental flats intersect this quadric in six 2-dimensional
quadrics which are all bispatial. We may therefore state the following
theorem:


Bar Bas Ber Bes Bis Bis Bas Bas ‘ 

O45 O43 O65 063 iq 



The self-dual $V_4^4$ associated with a Schläfli hexad has a 3-dimensional
quadric as locus of double points. The six fundamental flats intersect this quadric in
six 2-dimensional quadrics which are all bispatial. There are 7 such self-dual


* In R, p. 359, the fact was overlooked that the 2-dimensional quadric x/%2=<s/xs=««/x is 
a double locus on the self-dual Vj. We have then five 2-dimensional bispatial quadrics, one in each 
of the five fundamental flats. 


WEST VIRGINIA UNIVERSITY,
MORGANTOWN, W. VA.


ON QUASI-COMMUTATIVE MATRICES* 


BY 
NEAL H. McCOY 


1. Introduction. In quantum mechanics there appear infinite matrices $p$
and $q$ with the property that

$$pq - qp = c,$$

where $c$ is a scalar matrix. It is well known† that a relation of this type can
not be satisfied by finite matrices. However, in the calculation of commutation
formulas for polynomials in $p$ and $q$ no use is made of the fact that $c$ is
a scalar but merely that it is commutative with both $p$ and $q$.‡ And there do
exist pairs of finite matrices $x$, $y$ of the same order such that $xy-yx$ is not
zero and is commutative with both $x$ and $y$. Such matrices will be called
quasi-commutative matrices and either may be said to be quasi-commutative
with the other.
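
A small concrete pair may help fix the definition. The sketch below is an illustration chosen for this purpose, not an example taken from the paper: it uses the matrix units $E_{12}$ and $E_{23}$ of order 3 (0-based positions in the code) and checks that their commutator is non-zero and commutes with both. The helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def matsub(a, b):
    """Entrywise difference a - b."""
    return [[p - q for p, q in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def commutator(a, b):
    """ab - ba."""
    return matsub(matmul(a, b), matmul(b, a))


def unit(n, i, j):
    """Matrix unit of order n with a single 1 in row i, column j (0-based)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]


x = unit(3, 0, 1)
y = unit(3, 1, 2)
z = commutator(x, y)           # expected to be the matrix unit in position (0, 2)
zero = [[0] * 3 for _ in range(3)]

print(z, commutator(x, z) == zero, commutator(y, z) == zero)
```

Here $z$ has trace zero, as every commutator of finite matrices must, which is the reason a non-zero scalar matrix cannot occur on the right-hand side for finite matrices.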

In a certain sense the algebra of polynomials in a pair of quasi-commuta- 
tive matrices is homeomorphic to the algebra arising in quantum mechanics. 
It is hoped to discuss such algebras in some detail in a later paper. In the 
present paper we shall make a brief study of quasi-commutative matrices 
whose elements belong to the complex number field. 

The concept of quasi-commutativity is an extension or generalization of 
commutativity, and as would be expected, some of the results obtained are 
generalizations of known theorems concerning commutative matrices. 

The problem of determining quasi-commutative matrices is that of finding
matrices $x$, $y$, $z$ $(\ne 0)$ which satisfy the equations


$$xy - yx = z, \quad xz = zx, \quad yz = zy.$$


If $z$ is an assigned matrix, there may or may not exist matrices $x$ and $y$ such
that $(x, y, z)$ is a solution of these equations. In §3, we shall characterize
those matrices $z$ for which these equations do admit a solution.

If x is a given matrix, it is clear that there always exist matrices commu- 
tative with x but it is not evident whether there exist matrices quasi-com- 
mutative with x. It will be shown (§4) that in most cases there do exist such 
matrices, and a necessary and sufficient condition for their existence is ob- 


* Presented to the Society, December 27, 1932; received by the editors June 21, 1933.

† See, e.g., Birtwistle, The New Quantum Mechanics, p. 67.

‡ See a previous paper, On commutation formulas in the algebra of quantum mechanics, these
Transactions, vol. 31 (1929), pp. 793-806.




328 N. H. McCOY ‘ [April 


tained. The general form of a matrix quasi-commutative with x is also given 
for the case in which x has a single elementary divisor corresponding to each 
root. 

In §5, we prove a theorem about the roots of any scalar polynomial
$\psi(x, y)$ in quasi-commutative matrices $x$ and $y$. It is shown that if the roots
of $x$ and $y$ are $\lambda_i$ and $\mu_j$ respectively, then the roots of $\psi(x, y)$ are all of the
form $\psi(\lambda_i, \mu_j)$. This is merely an extension of a known theorem concerning
commutative matrices.*
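
The statement about roots can be tried on a small case. The sketch below is an assumed example, not one from the paper: it takes $x = 2 + E_{12}$ and $y = 3 + E_{23}$ of order 3, which are quasi-commutative, and the polynomial $\psi(x, y) = x^2y + yx - 4$, chosen arbitrarily. Both matrices are upper triangular, so the roots of $\psi(x, y)$ can be read off its diagonal and compared with $\psi(2, 3) = 14$.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def matadd(a, b):
    """Entrywise sum."""
    return [[p + q for p, q in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def scale(c, a):
    """Scalar multiple c * a."""
    return [[c * entry for entry in row] for row in a]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def commutator(a, b):
    return matadd(matmul(a, b), scale(-1, matmul(b, a)))


def psi(x, y):
    """The illustrative polynomial psi(x, y) = x^2 y + y x - 4."""
    n = len(x)
    return matadd(matadd(matmul(matmul(x, x), y), matmul(y, x)), scale(-4, identity(n)))


x = [[2, 1, 0], [0, 2, 0], [0, 0, 2]]   # single root 2
y = [[3, 0, 0], [0, 3, 1], [0, 0, 3]]   # single root 3
z = commutator(x, y)
value = psi(x, y)
roots = [value[i][i] for i in range(3)]  # diagonal of an upper triangular matrix

print(roots)
```

Because the example is triangular it only illustrates the theorem; it does not test the general case, where the roots have to be obtained from the characteristic equation.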

2. Commutative matrices. In this section we shall make some preliminary 
remarks and then mention a few known properties of commutative matrices 
which will be needed in later sections. 

Let $x$ be a given matrix of order $n$, with the elementary divisors
$(\lambda-\lambda_i)^{p_i}$ $(i=1, 2, \dots, r)$. Then by a proper choice of basis, $x$ may be
expressed in the canonical form†


(1) $x = \begin{pmatrix} X_1 & 0 & \cdots & 0 \\ 0 & X_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & X_r \end{pmatrix}$,


where $X_i$ is the matrix of order $p_i$,‡


(2) $X_i = \begin{pmatrix} \lambda_i & 1 & 0 & \cdots & 0 \\ 0 & \lambda_i & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & \cdots & \lambda_i \end{pmatrix}$.


If now we write $X_i=\lambda_i e_i+\eta_i$, the matrices $e_i$ and $\eta_i$ may be called
respectively the partial idempotent element and the partial nilpotent element§ of
$x$ corresponding to the elementary divisor $(\lambda-\lambda_i)^{p_i}$.


* Frobenius, Über vertauschbare Matrizen, Sitzungsberichte der Preussischen Akademie der
Wissenschaften zu Berlin, 1896, pp. 601-614.

† Cf. Bôcher, Higher Algebra, p. 289.

‡ We shall think of $X_i$ as a matrix of order $p_i$ or one of order $n$ at pleasure. It will be clear from
the context as which it is being considered.

§ Cf. J. H. M. Wedderburn, The automorphic transformation of a bilinear form, Annals of
Mathematics, (2), vol. 23 (1921), pp. 122-134. In general a matrix $a$ $(\ne 0)$ such that $a^2=a$ is said to be
idempotent, a matrix $b$ with the property that $b^\nu=0$, $b^{\nu-1}\ne 0$, is nilpotent of index $\nu$. The index of $\eta_i$ is
clearly $p_i$.

1934] QUASI-COMMUTATIVE MATRICES 329 


Let $\lambda_i$ $(i=1, 2, \dots, l)$ be the distinct roots of $x$ and denote by $\phi_i$ the sum
of all the partial idempotent elements of $x$ corresponding to elementary
divisors involving the same root $\lambda_i$. The matrix $\phi_i$ is called the principal
idempotent element of $x$ belonging to the root $\lambda_i$. It is of considerable importance
that the principal elements of a matrix $x$ may be expressed as scalar
polynomials in $x$.*
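
As a hedged illustration of the last remark, the sketch below takes a matrix of order 3 with elementary divisors $(\lambda-1)^2$ and $(\lambda-3)$ and writes its principal idempotent elements as polynomials in $x$. The matrix and the particular polynomial $(x-1)^2/4$ are choices made for this example only; the polynomial works here because the block belonging to the root 3 is of order 1, and it is not offered as a general formula.

```python
from fractions import Fraction


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def matsub(a, b):
    return [[p - q for p, q in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def scale(c, a):
    return [[c * entry for entry in row] for row in a]


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


# Canonical form with one block of order 2 for the root 1 and one of order 1 for the root 3.
x = [[Fraction(v) for v in row] for row in ([1, 1, 0], [0, 1, 0], [0, 0, 3])]
one = identity(3)

shifted = matsub(x, one)                                  # x - 1
phi_3 = scale(Fraction(1, 4), matmul(shifted, shifted))   # (x - 1)^2 / (3 - 1)^2
phi_1 = matsub(one, phi_3)                                # 1 - phi_3, again a polynomial in x

# Principal nilpotent part: what is left after removing 1*phi_1 + 3*phi_3.
eta = matsub(x, matsub(scale(Fraction(1), phi_1), scale(Fraction(-3), phi_3)))

print(phi_1, phi_3, eta)
```

The two idempotents come out as the diagonal matrices with 1 in the positions of the corresponding blocks, as described in the text for the form (3), and the remainder `eta` is nilpotent of index 2.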

If, in (1), we combine those blocks $X_i$ corresponding to elementary
divisors with the same root $\lambda_i$ into a single block $X^{(i)}$, we may express $x$ in the
form


(3) $x = \begin{pmatrix} X^{(1)} & 0 & \cdots & 0 \\ 0 & X^{(2)} & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & X^{(l)} \end{pmatrix}$.


The principal idempotent element of $x$ belonging to $\lambda_i$ is in this case a matrix
of order $n$ with 1 in each element of the principal diagonal corresponding to
the position of the diagonal of the submatrix $X^{(i)}$ in $x$, and zeros elsewhere.

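We now discuss briefly the form of a matrix $y$ commutative with a matrix
$x$ in the canonical form (1). If we set $e_i y e_j = Y_{ij}$, we may write $y$ in the form


(4) $y = \begin{pmatrix} Y_{11} & Y_{12} & \cdots & Y_{1r} \\ Y_{21} & Y_{22} & \cdots & Y_{2r} \\ \vdots & \vdots & & \vdots \\ Y_{r1} & Y_{r2} & \cdots & Y_{rr} \end{pmatrix}$,


where $Y_{ij}$ is a rectangular matrix of $p_i$ rows and $p_j$ columns.† The equation
$xy=yx$ is then equivalent to the set of equations,


(5) $X_i Y_{ij} = Y_{ij} X_j$ $(i, j = 1, 2, \dots, r)$.


Let now $s$ and $t$ be fixed values of $i$ and $j$, and consider the single equation
for $Y_{st}$,


(6) $X_s Y_{st} = Y_{st} X_t$.


The following facts are known concerning the equation (6).‡ If $\lambda_s \ne \lambda_t$, the
only solution is $Y_{st}=0$. If $\lambda_s=\lambda_t$ and $p_s \ge p_t$, the general solution is of the
form


(7) $Y_{st} = \begin{pmatrix} a_1 & a_2 & a_3 & \cdots & a_{p_t} \\ 0 & a_1 & a_2 & \cdots & a_{p_t-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & a_1 \\ 0 & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & 0 \end{pmatrix}$,


where $a_1, a_2, \dots, a_{p_t}$ are arbitrary parameters. Similarly if $\lambda_s=\lambda_t$ and
$p_s<p_t$, $Y_{st}$ takes the form


(8) $Y_{st} = \begin{pmatrix} 0 & \cdots & 0 & a_1 & a_2 & \cdots & a_{p_s} \\ 0 & \cdots & 0 & 0 & a_1 & \cdots & a_{p_s-1} \\ \vdots & & \vdots & \vdots & \vdots & & \vdots \\ 0 & \cdots & 0 & 0 & 0 & \cdots & a_1 \end{pmatrix}$.


We may now write at once the general form of a matrix $y$ commutative with
$x$, as we only need to solve each of the $r^2$ equations (5) for the $Y_{ij}$ by these
rules, and substitute in (4).


* Wedderburn, loc. cit., p. 126.

† By actual definition, $Y_{ij}$ is a matrix of order $n$, but there will be no confusion as it has non-zero
elements only in the rectangular block indicated in the array.

‡ H. Kreis, Contribution à la Théorie des Systèmes Linéaires, Zurich Thesis, 1906. See also Hilton,
Homogeneous Linear Substitutions, Oxford, 1914, chapter 5.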
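
The rules for equation (6) can be checked on small blocks. The sketch below is an illustration under stated assumptions: the blocks $X_s$, $X_t$ carry their 1's just above the diagonal, and `toeplitz_solution` (a hypothetical helper, not a routine from the paper or from any library) builds a matrix whose entries are constant along diagonals, padded with zero rows at the bottom when there are more rows than columns and with zero columns on the left when there are more columns than rows.

```python
def matmul(a, b):
    """Product of (possibly rectangular) matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def jordan_block(root, order):
    """Block with `root` on the diagonal and 1 just above it."""
    return [[root if i == j else 1 if j == i + 1 else 0 for j in range(order)]
            for i in range(order)]


def toeplitz_solution(rows, cols, params):
    """Rectangular matrix with params[0], params[1], ... along successive diagonals."""
    size = min(rows, cols)
    if len(params) != size:
        raise ValueError("need exactly min(rows, cols) parameters")
    shift = max(cols - rows, 0)   # number of leading zero columns
    return [[params[j - i - shift] if 0 <= j - i - shift < size else 0
             for j in range(cols)] for i in range(rows)]


tall = toeplitz_solution(3, 2, [5, 7])   # more rows than columns
wide = toeplitz_solution(2, 3, [5, 7])   # more columns than rows

# Equal roots: both shapes satisfy X_s Y = Y X_t.
tall_ok = matmul(jordan_block(4, 3), tall) == matmul(tall, jordan_block(4, 2))
wide_ok = matmul(jordan_block(4, 2), wide) == matmul(wide, jordan_block(4, 3))
# Unequal roots: the same non-zero matrix no longer satisfies the equation.
unequal_fails = matmul(jordan_block(1, 3), tall) != matmul(tall, jordan_block(2, 2))

print(tall, wide, tall_ok, wide_ok, unequal_fails)
```

The check confirms only that matrices of these shapes are solutions; that they are the general solution is the known result cited in the text.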

The following application will be of importance in the sequel. Let $(x, y, z)$
be a solution of the equations


(9) $xy - yx = z, \quad xz = zx, \quad yz = zy.$


If $S$ is any non-singular matrix of the same order, then clearly $(SxS^{-1}, SyS^{-1},
SzS^{-1})$ is also a solution of these equations. Hence we may, without loss of
generality, assume that $x$ is in canonical form (3).* Set $Y^{(ij)}=\phi_i y \phi_j$,
$Z^{(ij)}=\phi_i z \phi_j$, where the $\phi_i$ are the principal idempotent elements of $x$. The
first of equations (9) may be replaced by the set of equations


$$X^{(i)}Y^{(ij)} - Y^{(ij)}X^{(j)} = Z^{(ij)} \quad (i, j = 1, 2, \dots, l).$$


But $x$ and $z$ are commutative and hence, by the above results, $Z^{(ij)}=0$ if
$i \ne j$ and thus also $Y^{(ij)}=0$ if $i \ne j$. Hence $y$ and $z$ must be of the types


$$y = \begin{pmatrix} Y^{(11)} & & 0 \\ & \ddots & \\ 0 & & Y^{(ll)} \end{pmatrix}, \quad z = \begin{pmatrix} Z^{(11)} & & 0 \\ & \ddots & \\ 0 & & Z^{(ll)} \end{pmatrix},$$


where the elements not in the diagonal blocks are necessarily zero. Thus the
equations (9) are equivalent to the equations


$$X^{(i)}Y^{(ii)} - Y^{(ii)}X^{(i)} = Z^{(ii)}, \quad X^{(i)}Z^{(ii)} = Z^{(ii)}X^{(i)}, \quad Y^{(ii)}Z^{(ii)} = Z^{(ii)}Y^{(ii)}$$
$$(i=1, 2, \dots, l).$$


* Cf. Bôcher, op. cit., p. 283.


If $l>1$, the solution $(x, y, z)$ of the equations (9) is therefore reducible and
may be said to be the direct sum of the solutions $(X^{(i)}, Y^{(ii)}, Z^{(ii)})$ $(i=1, 2,
\dots, l)$. We have thus shown that if $(x, y, z)$ is a solution of the equations
(9) and $x$ (or $y$) has more than one distinct root, the solution is reducible.
Since $\phi_i\phi_j=\phi_j\phi_i=0$ $(i \ne j)$ and $y=\sum_i \phi_i y \phi_i$, we see that $\phi_i y = y\phi_i$.
This proves the
THEOREM 1. If $x$ and $y$ are quasi-commutative matrices, the principal
idempotent elements of either matrix are commutative with the other.


This theorem is of course obvious for commutative matrices as the prin- 
cipal elements of a matrix are polynomials in that matrix. 

3. Characterization of the matrix $z$. Let $(x, y, z)$ be a solution, in
matrices of order $n$, of the equations (9). We shall assume, as we may, that $z$ is
in canonical form. If $z$ has more than one distinct root, the solution $(x, y, z)$
is reducible as $x$ and $y$ are commutative with $z$. Hence let $z$ have the single
root $a$. From the first of equations (9), we see that


trace $z = 0 = na$,*


that is, $a=0$, and $z$ is nilpotent.
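
The step "trace $z = 0$" uses only the fact that $xy$ and $yx$ have the same trace. A minimal numerical sketch, with two matrices of order 3 chosen arbitrarily for illustration (they are not from the paper):

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[p - q for p, q in zip(row_ab, row_ba)] for row_ab, row_ba in zip(ab, ba)]


def trace(a):
    """Sum of the elements in the principal diagonal."""
    return sum(a[i][i] for i in range(len(a)))


# Arbitrary integer matrices; no special structure is assumed.
x = [[1, 2, 0], [3, -1, 4], [0, 5, 2]]
y = [[2, 0, 1], [1, 1, 0], [-3, 4, 6]]
z = commutator(x, y)

print(trace(matmul(x, y)), trace(matmul(y, x)), trace(z))
```

If $z$ has the single root $a$, its trace is $na$, so a vanishing trace forces $a=0$.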
Let the elementary divisors of z be 


k 


t==1 


We shall find it convenient to use the symbol $[n_1, n_2, \dots, n_k]$, called the
characteristic of $z$, to denote the degrees of these elementary divisors when
arranged in the order indicated. We shall now prove the following theorem:


* The trace of a square matrix is defined as the sum of the elements in the principal diagonal.
† Cf. Bôcher, op. cit., p. 287.




THEOREM 2. If $z$ is a given matrix of order $n$, necessary and sufficient
conditions that there exist matrices $x$ and $y$ of order $n$ satisfying the equations


(9) $xy - yx = z, \quad xz = zx, \quad yz = zy$


are that $z$ be nilpotent and that it have a characteristic of the type $[n_1, n_2, \dots, n_k]$,
where $n_k=1$ and $n_i-n_{i+1}=0$ or $1$ $(i=1, 2, \dots, k-1)$.
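
The condition on the characteristic in Theorem 2 is easy to test mechanically. The sketch below is a hedged reading, not the author's formulation: the scanned statement is damaged at this point, and the code assumes the condition to be that the degrees are non-increasing, that the last one is 1, and that consecutive degrees differ by 0 or 1. The function name is hypothetical.

```python
def has_admissible_characteristic(degrees):
    """Check the assumed form of the condition of Theorem 2 on [n_1, ..., n_k].

    `degrees` lists the degrees of the elementary divisors of a nilpotent z,
    in non-increasing order.
    """
    degrees = list(degrees)
    if not degrees or any(n < 1 for n in degrees):
        raise ValueError("degrees must be positive integers")
    if degrees != sorted(degrees, reverse=True):
        raise ValueError("degrees must be listed in non-increasing order")
    last_is_one = degrees[-1] == 1
    steps_small = all(a - b in (0, 1) for a, b in zip(degrees, degrees[1:]))
    return last_is_one and steps_small


for characteristic in ([2, 1], [3, 2, 1], [2, 2], [3, 1], [2]):
    print(characteristic, has_admissible_characteristic(characteristic))
```

The characteristic [2, 1] is that of the matrix unit $E_{13}$ of order 3, which does arise as $xy-yx$ for $x=E_{12}$, $y=E_{23}$. The characteristic [2] is that of $E_{12}$ of order 2, and every matrix of order 2 commuting with it is a polynomial in it, so any two such matrices commute and no pair $x$, $y$ can exist.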

We first establish the necessity of these conditions. It has already been 
shown that z must be nilpotent. Assume that z does not have a characteristic 
of the form stated in the theorem but that there exist matrices x and y satis- 
fying the equations (9). We may write the characteristic of z in the more ex- 
plicit form 


1—1); but either or nn—M141,1>1. In either case mn 22. 

We shall now assume that z is in canonical form. Let f,,, be the partial 
idempotent element of z corresponding to the elementary divisor \"*", and 
set fn, Xfnj, fn, = fn, = Then =0 un- 
less i=j, r=s, and Z*‘r"* is the partial nilpotent element of z corresponding 
to the elementary divisor \**. 

Now X**r"* is a rectangular matrix of m;, rows and m;, columns, and since 
x is commutative with z, it will be of the general type (7) if <7 and of the 
type (8) if i=7. We shall find it convenient to denote the element of X*#r#« 
corresponding to a in (7) or (8) by a,*, that corresponding to az by. b;*. 
Similarly these elements in Y**"#+ will be denoted by a,* and 6," respectively. 
We shall also let X**"* denote the element in the ath row and bth column 
of X*i*, and similarly for y and z. 

Let i</, p<k; be fixed positive integers. Then from the first of equations 
(9) we get the equation 


m k; 
(10) DSS — = Zrivrip, 


j=1 


for the part of z in the (ip, mip) block. But m;,22 and hence Z}/?"”=1. 
Thus from (10) we have 


m kj n 


j=l 


It may be verified from the form of x and y that X7j""? = Yj" =0 if v>2, 




1934] QUASI-COMMUTATIVE MATRICES 333 


and thus in the sum (11) we may limit v to the values 1, 2. If 7<i—1, 
= =0 (v=1, 2); and if j>i+1, (y=1, 2). 
Hence 7 may be restricted to the values i—1, 7, 7+1, the first clearly being 
omitted if 7=1. If i=/, there are, under our hypotheses, two cases to be con- 
sidered according as /=m or 1<~m. In the first case clearly 7 can not take the 
value i+1. If i=l<m, we have assumed that #n—141,:>1, hence 
— — 0) (y=1, 2). Thus in either case if i=/, the index j 
may be restricted to the two values 7—1, 7. 
Using the notation introduced above, we get from (11) 


— gri-lrqip 
Nip Ni-i,r 


+ > + — — ans 


Nie ni ni Nip nip 
oni is ip is ip ip te ip te 
ki+t 


+ (on — gnriti.t ) =1 


t=1 nip nip niti,t 


=1,2,--+, 


with the understanding that the first sum is to be omitted when i = 1, the last when i = l. Let us now sum the equations (12) with respect to the index p. The middle sum then vanishes. Hence if l = 1, the sum of the left members of the equations (12) vanishes, which is clearly impossible. Suppose l > 1. If we denote by K(u, v) the double sum


» ( — a™raup 


p=1 r=1 "up "up 


the resulting l equations may be written in order as follows:

(13)
$K(1, 2) = k_1,$
$K(2, 1) + K(2, 3) = k_2,$
$\cdots$
$K(l-1, l-2) + K(l-1, l) = k_{l-1},$
$K(l, l-1) = k_l.$

But $K(u, v) = -K(v, u)$, and hence if we add the equations (13) we get $0 = \sum_{i=1}^{l} k_i$, which is impossible as $l \ge 1$, $k_i \ge 1$. Therefore there can exist no matrices x and y satisfying the equations (9). We have thus established the necessity of the conditions stated in the theorem.

Let us now assume that z has a characteristic of the type prescribed by the theorem and exhibit a pair of matrices x and y satisfying the equations
(9). There is no loss of generality in assuming that the elementary divisors 
of z are not all linear, as in this case any pair of commutative matrices will 
satisfy the equations. Let the characteristic of be [m, m2, Me, Me41, 

-+, my], where k>t+1, +--+ =m=1, or 
1 (¢=1, 2,---, ¢-—1). If f,, is the partial idempotent element of z corre- 
sponding to the elementary divisor \"*‘, we may again break any matrix A 
of order m into rectangular sub-matrices A"*"i =f, ,Af,;, and may call this the 
(n;, 2;) block of A. 

Let ¢7*" denote the matrix of order m for which the element in the pth 
row and gth column of the (m;, ”;) block is 1 and all other elements are zero. 
Then 
iff = 4, 


erinignrinn = { 
0,if7 Alorg#r. 


ve 


With this notation we have 


If now we set 


> eniting (i 
por 


it may be verified that each A; and B; is commutative with z and further that 


A;B i=j, 
ivy, 


i=jHFt, 
0,ixjori=j=t. 


(14) 


Hence if we set 
t 


t 


j=1 j=1 


then x and y are commutative with z and by the relations (14) we find 


t t—1 t 
xy— yx = > = = Z. 


j=1 j=1 


t t n;-l 
ni—l 
A; = eniniti t 
ni+l 




Thus we have exhibited a solution of the equations (9), and the proof of 
Theorem 2 is completed. 
The following corollary follows readily. 


COROLLARY. There exist no quasi-commutative matrices of order two.


For if n = 2, the only choice of a characteristic for z which satisfies the conditions of Theorem 2 is [1, 1], that is, z = 0, and x and y are commutative.
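The condition of Theorem 2 can be enumerated for small orders. The following is a minimal sketch, assuming the condition is read as: the characteristic is non-increasing, its last entry is 1, and consecutive entries differ by 0 or 1. The function names are illustrative and do not come from the paper.

```python
def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing lists of positive integers."""
    if largest is None:
        largest = n
    if n == 0:
        yield []
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield [first] + rest


def is_admissible(char):
    """Assumed reading of Theorem 2: last degree is 1, neighbours differ by 0 or 1."""
    return char[-1] == 1 and all(a - b in (0, 1) for a, b in zip(char, char[1:]))


def admissible_characteristics(n):
    """All characteristics of a nilpotent z of order n allowed by the condition."""
    return [p for p in partitions(n) if is_admissible(p)]


for order in (2, 3, 4):
    print(order, admissible_characteristics(order))
```

For order two the only admissible characteristic is [1, 1], which is the content of the Corollary; for order three the characteristic [2, 1] also appears.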

4. Matrices quasi-commutative with a given matrix. Let x be an assigned matrix of order n. We shall in this section find necessary and sufficient conditions that there exist a matrix y quasi-commutative with x, and shall determine the general form of all such matrices for the special case in which x has a single elementary divisor corresponding to each root.

We first consider the case in which x has just one elementary divisor. 
There is no loss of generality in assuming that the root of x is zero,* hence 
if x is put in canonical form we have 


(15) $x = \sum_{i=1}^{n-1} e_{i,i+1}$.†


If n = 1 or 2, there exist no matrices quasi-commutative with x. Hence assume $n \ge 3$. Now if y and z are matrices such that (x, y, z) is a solution of equations (9), z is commutative with x and must also be nilpotent. Hence it must be of the form


(16) $z = a_1x + a_2x^2 + \cdots + a_{n-1}x^{n-1},$


where the a; are scalars. 
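If we write $y = \sum y_{ij}e_{ij}$, the equation xy − yx = z becomes

$\sum_{i,j} y_{ij}(e_{i-1,j} - e_{i,j+1}) = \sum_{r,q} a_r e_{q,q+r}.$

From this it follows that $y_{i+1,j} = y_{i,j-1}$ $(i \ge j)$, where $y_{i0} = y_{n+1,j} = 0$. Thus if i > j, $y_{ij} = 0$. If $i \le j$, $y_{i+1,j+1} = y_{ij} + a_{j-i+1}$, from which we see that

$y_{ij} = y_{1,j-i+1} + (i-1)a_{j-i+1} \quad (i \le j).$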


* For if x and y are quasi-commutative, so are x − λ and y, where λ is any scalar matrix.

† The matrix unit $e_{pq}$ is a matrix with 1 at the intersection of the pth row and qth column, and zeros elsewhere. The rule for multiplying these units is $e_{pq}e_{rs} = 0$ $(q \neq r)$, $e_{pq}e_{qs} = e_{ps}$. It will be convenient to let $e_{pq} = 0$ if either p or q is greater than n or less than 1.




If then we set $y_{1i} = y_i$, y is of the form*

(17) $y = \begin{pmatrix} y_1 & y_2 & y_3 & \cdots & y_n \\ 0 & y_1 + a_1 & y_2 + a_2 & \cdots & y_{n-1} + a_{n-1} \\ 0 & 0 & y_1 + 2a_1 & \cdots & y_{n-2} + 2a_{n-2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & y_1 + (n-1)a_1 \end{pmatrix}.$

If x is the matrix (15), then y and z defined by (17) and (16), where $y_1, y_2, \dots, y_n, a_1, a_2, \dots, a_{n-1}$ are arbitrary parameters, are the most general matrices satisfying the first two of equations (9). We wish now to make necessary restrictions on y and z in order that they may be commutative. Let y given by (17) be expressed in the form $y = Y_1 + Y_2$, where $Y_1 = y_1 + y_2x + y_3x^2 + \cdots + y_nx^{n-1}$. Then y will be commutative with z if and only if $Y_2$ and z are commutative. Since the elements of $Y_2$ depend only upon the parameters $a_i$, the only restriction will be on these parameters, that is, on the matrix z.

In (16) suppose $a_k \neq 0$ $(1 \le k \le n-1)$ but $a_i = 0$ $(i < k)$. If integers p, q are defined by

$n = pk + q, \quad 0 \le q < k,$

then it is known† that z has q elementary divisors of degree p + 1 and k − q of degree p. Hence by Theorem 2, there can exist matrices x and y satisfying equations (9) only if p = 1, that is, $k \ge (n+1)/2$. Let k be the smallest integer satisfying this inequality. Then the elements of z are all zero except in the block of n − k rows and n − k columns in the upper right hand corner. Similarly $Y_2$ has non-zero elements only in the block of n − k + 1 rows and n − k columns in the upper right hand corner. Hence $zY_2 = Y_2z = 0$, and we have thus established the following theorem:


THEOREM 3. If x is the matrix (15) of order $n \ge 3$ and k is the smallest integer not less than (n+1)/2, then the general form of a matrix y quasi-commutative with x is given by (17) where $a_i = 0$ $(i = 1, 2, \dots, k-1)$, and $a_k, \dots, a_{n-1}, y_1, y_2, \dots, y_n$ are arbitrary parameters.
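Theorem 3 can be checked numerically for a particular order. The sketch below assumes the form (17) is read as $y_{ij} = y_{j-i+1} + (i-1)a_{j-i+1}$ for $i \le j$ and zero otherwise; the helper names and the sample parameter values are illustrative only.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    """Return ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[p - q for p, q in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


def is_zero(m):
    return all(v == 0 for row in m for v in row)


def nilpotent_block(n):
    """The matrix (15): ones on the first superdiagonal."""
    return [[1 if j == i + 1 else 0 for j in range(n)] for i in range(n)]


def form_17(ys, a):
    """Assumed reading of (17); ys = [y_1..y_n], a = [a_1..a_{n-1}]."""
    n = len(ys)
    a = list(a) + [0]  # padding: the offset n-1 occurs only in row 1, coefficient 0
    return [[ys[j - i] + i * a[j - i] if j >= i else 0 for j in range(n)]
            for i in range(n)]


n = 5
k = (n + 2) // 2                      # smallest integer not less than (n + 1) / 2
x = nilpotent_block(n)

# a_i = 0 for i < k, remaining parameters chosen arbitrarily
y = form_17([1, 2, 3, 4, 5], [0] * (k - 1) + [2, 5])
z = commutator(x, y)
print(not is_zero(z), is_zero(commutator(x, z)), is_zero(commutator(y, z)))

# violating the restriction (a_2 != 0 while k = 3) breaks yz = zy
y_bad = form_17([1, 2, 3, 4, 5], [0, 1, 0, 0])
z_bad = commutator(x, y_bad)
print(is_zero(commutator(y_bad, z_bad)))
```

The first line of output reports that z is not zero and commutes with both x and y; the second shows that yz = zy fails once a parameter $a_i$ with i < k is non-zero.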


* Cf. R. Weitzenböck, Über die Matrixgleichung Ax+xB=C, Akademie van Wetenschappen te Amsterdam, Proceedings, vol. 35 (1932), pp. 54-59. The form (17) obtained for y is a special case of a formula given by Weitzenböck.

† H. Kreis, op. cit., p. 47. See also D. E. Rutherford, On the canonical form of a rational integral function of a matrix, Proceedings of the Edinburgh Mathematical Society, (2), vol. 3 (1932), pp. 135-143.




This theorem is sufficient to give the form of y even if x has more than one 
root, provided it has only one elementary divisor corresponding to each root.
For by the results of §2, the general form of such a y is the direct sum of 
matrices of the type prescribed by this theorem. If x is not of this simple type 
it seems to be difficult to give the general form of a matrix quasi-commutative 
with x and we shall now limit ourselves to a consideration of the conditions 
under which such matrices exist. 

Let x be a matrix of order n and let us separate the elementary divisors of x into two sets $A_1$ and $A_2$. By a proper choice of basis x may then be expressed as the direct sum of two matrices, thus

$x = \begin{pmatrix} X_1 & 0 \\ 0 & X_2 \end{pmatrix},$

where the matrix $X_1$ has the elementary divisors $A_1$ and $X_2$ the elementary divisors $A_2$. If now $Y_1$ is a matrix quasi-commutative with $X_1$, then

$y = \begin{pmatrix} Y_1 & 0 \\ 0 & 0 \end{pmatrix}$

will be quasi-commutative with x. Hence there will exist a matrix quasi-commutative with a given matrix x if there exists a matrix quasi-commutative with a matrix the set of whose elementary divisors is a subset of the elementary divisors of x. We may now prove the following lemma:


LEMMA. Let x be a matrix of order n with a single root $\lambda_1$. A necessary and sufficient condition that there exist a matrix y quasi-commutative with x is that n be greater than two and the elementary divisors of x be not all linear.


The necessity of these conditions follows from the Corollary to Theorem 2 
and from the fact that if the elementary divisors of x are all linear, x is a 
scalar matrix and is thus commutative with all matrices of order n. 

We now prove that these conditions are sufficient. If n > 2 and the elementary divisors of x are not all linear then (i) x has an elementary divisor of degree $\ge 3$, or (ii) x has at least two elementary divisors of degree 2, or (iii) x has one elementary divisor of degree 2 and at least one of degree 1.
In the first case, Theorem 3 establishes the existence of a matrix y quasi- 
commutative with x. The existence of such a matrix y in the cases (ii) and 
(iii) is shown by the two examples 

it 12 + ex, = eu, 
and 


$x = \lambda_1 + e_{12}, \quad y = e_{23},$
respectively. 
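The example for case (iii) can be verified directly. The sketch below takes x = λ₁ + e₁₂ and y = e₂₃ of order three, with λ₁ = 7 chosen arbitrarily for illustration; the helper names are not from the paper.

```python
def unit(n, p, q):
    """Matrix unit e_pq of order n (1-based indices)."""
    return [[1 if (i, j) == (p - 1, q - 1) else 0 for j in range(n)]
            for i in range(n)]


def scalar(n, c):
    """Scalar matrix c times the identity."""
    return [[c if i == j else 0 for j in range(n)] for i in range(n)]


def add(a, b):
    return [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(a, b)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    """Return ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[p - q for p, q in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


zero = scalar(3, 0)
x = add(scalar(3, 7), unit(3, 1, 2))   # lambda_1 = 7 is an arbitrary sample value
y = unit(3, 2, 3)
z = commutator(x, y)

is_quasi_commutative = (z != zero
                        and commutator(x, z) == zero
                        and commutator(y, z) == zero)
print(z, is_quasi_commutative)
```

Here z works out to the matrix unit e₁₃, which is not zero and commutes with both x and y.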




The following theorem follows readily. 


THEOREM 4. A necessary and sufficient condition that there exist a matrix y quasi-commutative with a given matrix x is that for some root $\lambda_i$ of x the sum of the degrees of the elementary divisors of x associated with $\lambda_i$ be greater than two, and at least one of these elementary divisors be not linear.
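The criterion of Theorem 4 depends only on the degrees of the elementary divisors attached to each root, so it can be written as a one-line test. In this sketch the input format (a dictionary from each root to the list of degrees of its elementary divisors) and the function name are assumptions made for illustration.

```python
def has_quasi_commutative_partner(elementary_divisors):
    """Theorem 4: some root carries total degree > 2 and a non-linear divisor.

    elementary_divisors maps each root of x to the list of degrees of the
    elementary divisors associated with that root.
    """
    return any(sum(degrees) > 2 and max(degrees) > 1
               for degrees in elementary_divisors.values())


examples = {
    "one divisor of degree 2": {0: [2]},
    "degrees 2 and 1, same root": {0: [2, 1]},
    "scalar matrix of order 3": {0: [1, 1, 1]},
    "two roots, each of degree 2": {0: [2], 1: [2]},
    "one divisor of degree 3": {0: [3]},
}
for label, divisors in examples.items():
    print(label, has_quasi_commutative_partner(divisors))
```

The fourth example shows that the degrees must be summed root by root: a matrix of order four with two distinct roots, each carrying a single divisor of degree 2, fails the criterion.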

5. Roots of a polynomial in quasi-commutative matrices. In this section we shall prove the following theorem which is well known for the case of commutative matrices.*

THEOREM 5. If x and y are quasi-commutative matrices with principal idempotent elements $R_i$ and $S_j$ (i, j = 1, 2, ...) and corresponding roots $\lambda_i$ and $\mu_j$ respectively, then the roots of any scalar polynomial $\psi(x, y)$ in x and y are $\psi(\lambda_i, \mu_j)$, where i and j take only those values for which $R_iS_j \neq 0$.†


We first prove two lemmas.

LEMMA 1. If $\psi(x, y)$ is any scalar polynomial in the quasi-commutative matrices x and y, and z = xy − yx, then

$\psi(x, y) = \psi_1(x, y) + z\psi_2(x, y, z),$

where $\psi_1$ is of the form $\sum a_{ij}x^iy^j$, the $a_{ij}$ being scalars.

It is clear that by repeated substitutions of the type yx = xy − z, $\psi(x, y)$ can be reduced to the form $\sum a_{ijk}z^ix^jy^k$. Hence we only need to set

$\psi_1(x, y) = \sum a_{0jk}x^jy^k, \quad z\psi_2(x, y, z) = \sum_{i \ge 1} a_{ijk}z^ix^jy^k.$

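LEMMA 2. If x and y are quasi-commutative matrices and z = xy − yx, then

$x^{m_1}y^{n_1}x^{m_2}y^{n_2} \cdots x^{m_k}y^{n_k} = \sum_{l=0} a_l z^l x^{P-l} y^{Q-l},$

where the $m_i$ and $n_i$ are any positive integers and $P = \sum_{i=1}^{k} m_i$, $Q = \sum_{i=1}^{k} n_i$.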


The sum of the exponents of x which appear explicitly in a given term may be called the degree of the term in x, and similarly for y. For example, the term $z^4x^2y^4x^2y$ is of degree 4 in x and 5 in y. If we think of z as being replaced by xy − yx, then the total degree in x and y would be 17. For convenience, let us call this the weight of the term.

As above, we may express $x^{m_1}y^{n_1}x^{m_2}y^{n_2} \cdots x^{m_k}y^{n_k}$ in the form

(18) $\sum a_{ijk}z^ix^jy^k$

* Frobenius, loc. cit. The method of this section is a modification of that used by Wedderburn, loc. cit., p. 127.

† It is not necessary that x and y be quasi-commutative in order that the roots of $\psi(x, y)$ shall be of the form $\psi(\lambda_i, \mu_j)$. Cf. G. S. Bruton, Certain aspects of the theory of equations for a pair of matrices, and M. H. Ingraham, A study of certain related pairs of square matrices. Abstracts of these papers appear in the Bulletin of the American Mathematical Society, vol. 38 (1932), p. 633.




by repeated substitutions of xy − z for yx. Each time this substitution is made in a term we get two terms; in one the degree in x and in y is the same as before the substitution, in the other the degree of each has been reduced by one. The weight is invariant under a substitution of this form. Hence each term of (18) is of the type $z^ix^jy^k$ where $j = P - i$, $k = Q - i$, $2i + j + k = P + Q$, that is, $i = l$.
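The substitution argument of Lemma 2 can be carried out mechanically. The following sketch rewrites a word in the letters x and y by repeatedly replacing yx with xy − z, treating z as commuting with both letters, and returns the coefficients $a_l$ of $z^l x^{P-l} y^{Q-l}$. The string representation of words and the function name are assumptions made for illustration.

```python
from collections import defaultdict


def normal_order(word):
    """Rewrite a word in 'x' and 'y' using yx -> xy - z, with z central.

    Returns {l: a_l}, meaning word = sum over l of a_l z^l x^(P-l) y^(Q-l),
    where P and Q count the letters x and y in the word.
    """
    P, Q = word.count("x"), word.count("y")
    result = defaultdict(int)
    stack = [(1, 0, word)]                 # (coefficient, power of z, remaining word)
    while stack:
        coeff, l, w = stack.pop()
        i = w.find("yx")
        if i < 0:
            # no y precedes an x: w is x...xy...y, and each substitution that
            # produced a z removed one x and one y
            assert w.count("x") == P - l and w.count("y") == Q - l
            result[l] += coeff
        else:
            stack.append((coeff, l, w[:i] + "xy" + w[i + 2:]))   # the xy term
            stack.append((-coeff, l + 1, w[:i] + w[i + 2:]))     # the -z term
    return {l: c for l, c in sorted(result.items()) if c != 0}


print(normal_order("yx"))     # yx = xy - z
print(normal_order("yxx"))    # y x^2 = x^2 y - 2 z x
print(normal_order("yyxx"))   # y^2 x^2 = x^2 y^2 - 4 z x y + 2 z^2
```

Every term produced has exponents P − l and Q − l on x and y, which is the invariance of the weight used in the proof.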

We may now proceed with the proof of the theorem. The principal elements $R_i$ and $S_j$ are polynomials in x and y respectively, and by Theorem 1 are commutative with each other and with x, y and z = xy − yx. If we set $T_{ij} = R_iS_j$, then $T_{ij}T_{pq} = 0$ if $i \neq p$ or $j \neq q$. Further those $T_{ij}$ which are not zero are linearly independent, for if $\sum a_{ij}T_{ij} = 0$, then $a_{pq}T_{pq} = 0$ and thus $T_{pq} = 0$ unless $a_{pq} = 0$.

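Let us now write

$x = \sum_{i,j} [\lambda_i + (x - \lambda_i)]T_{ij}, \quad y = \sum_{i,j} [\mu_j + (y - \mu_j)]T_{ij}.$

The matrices $(x - \lambda_i)T_{ij}$ and $(y - \mu_j)T_{ij}$ are then nilpotent. If $\psi(x, y)$ is a polynomial in x and y, we have, by the first lemma,

$\psi(x, y) = \psi_1(x, y) + z\psi_2(x, y, z),$

where $\psi_1(x, y) = \sum a_{ij}x^iy^j$. Since no interchange of order is necessary, we may write, as in the commutative case,

$\psi_1(x, y) = \sum_{i,j} \Big[\psi_1(\lambda_i, \mu_j) + \sum_{r,s} \psi^{ij}_{rs}(x - \lambda_i)^r(y - \mu_j)^s\Big]T_{ij},$

where the $\psi^{ij}_{rs}$ are scalar constants. Thus

$\psi(x, y) = \sum_{i,j} \Big[\psi_1(\lambda_i, \mu_j) + \sum_{r,s} \psi^{ij}_{rs}(x - \lambda_i)^r(y - \mu_j)^s\Big]T_{ij} + z\psi_2(x, y, z),$

with the understanding that r and s are not to be zero simultaneously. Let us set

$A = \sum_{i,j} \psi_1(\lambda_i, \mu_j)T_{ij}, \quad B = \sum_{i,j} \Big[\sum_{r,s} \psi^{ij}_{rs}(x - \lambda_i)^r(y - \mu_j)^s\Big]T_{ij} + z\psi_2(x, y, z).$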

Then $\psi(x, y) = A + B$, and A and B are commutative. It will be shown below that B is nilpotent. Hence the roots of $\psi(x, y)$ are the roots of A and these are of the form $\psi_1(\lambda_i, \mu_j)$, where $T_{ij} \neq 0$.* But $\psi_1(\lambda_i, \mu_j) = \psi(\lambda_i, \mu_j)$ and the theorem is established. We shall now complete this proof by showing that B is nilpotent.


* Wedderburn, loc. cit. 




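Let

$B_1 = \sum_{i,j} \Big[\sum_{r,s} \psi^{ij}_{rs}(x - \lambda_i)^r(y - \mu_j)^s\Big]T_{ij} = \sum_{i,j} A_{ij}T_{ij}.$

Since $T_{ij}$ is commutative with each $A_{ij}$, we see that

$B_1^k = \sum_{i,j} A_{ij}^k T_{ij}.$

We can thus show that $B_1$ is nilpotent by showing that each $A_{ij}T_{ij}$ is nilpotent. Let $x_1 = x - \lambda_i$, $y_1 = y - \mu_j$; then $x_1$ and $y_1$ are quasi-commutative and $x_1y_1 - y_1x_1 = z$. Let $p_1$ be the index of the nilpotent matrix $x_1R_i$, $p_2$ that of $y_1S_j$, and t that of z. Let $N = t + \max(p_1, p_2)$ and consider

$(A_{ij}T_{ij})^{2N} = \Big[\sum_{r,s} \psi^{ij}_{rs}x_1^ry_1^s\Big]^{2N}T_{ij}.$

The right hand side will, when expanded, consist of a sum of terms of the general type $x_1^{m_1}y_1^{n_1}x_1^{m_2}y_1^{n_2} \cdots T_{ij}$. But by Lemma 2, this may be put in the form

$\sum_l a_l z^l x_1^{P-l} y_1^{Q-l} T_{ij}.$

Here $P = \sum m_i$, $Q = \sum n_i$, and as r and s are not both zero, either $P \ge N$ or $Q \ge N$. Now the term in this last sum of which $a_l$ is the coefficient is zero provided $l \ge t$ or $P - l \ge p_1$ or $Q - l \ge p_2$. Suppose, for example, that $P \ge N$. Then the term containing $a_l$ is zero provided $l \ge t$ or $l \le P - p_1$. But $P - p_1 \ge N - p_1 \ge t$, and thus all l are included. Hence $(A_{ij}T_{ij})^{2N} = 0$ and $B_1$ is nilpotent of index, say, r. It follows that

$B^{rt} = [(B_1 + z\psi_2)^r]^t = [B_1^r + z\psi_3]^t = 0,$

where $\psi_3$ denotes a polynomial in x, y and z.

Hence B is nilpotent and the proof of the theorem is completed.

It may be noted that the fact that z = xy − yx is nilpotent is a special case of this theorem. For if we choose $\psi(x, y) = xy - yx$, the roots of $\psi$ must be of the form $\lambda_i\mu_j - \mu_j\lambda_i = 0$.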


SMITH COLLEGE,
NORTHAMPTON, MASS.


ON THE DISTRIBUTIONS OF THE ZEROS OF SUMS OF 
EXPONENTIALS OF POLYNOMIALS* 


BY 
L. A. MacCOLL 


1. INTRODUCTION 
Certain work by other investigators suggests the problem of determining the distribution of the zeros of a function of the form

(1) $f(z) = \sum_{j=0}^{J} \exp(\lambda_{jN}z^N + \cdots + \lambda_{j1}z + \lambda_{j0}),$

where J and N are positive integers, and the λ's are real or complex constants. The present paper gives the chief results of a study of this problem.
In order to add precision to the problem, and in order to exclude certain ex- 
treme cases which are of minor interest, it is assumed (1) that we do not have 
Nov =Aw= =Ayy; (2) that we do not have Aon =Ain= for 
any n<WN; (3) that for no two distinct values, 7’ and 7’’, of 7 do we have all 
of the W relations Ajn=Ajrn, N=1, 2,---, N. For the sake of brevity, a 
function of the form (1) which satisfies these conditions will be called an 
E-function. The integer N will be called the exponent of the E-function. 

The problem discussed here is essentially a generalization, in one direction, of a problem that has already been the subject of numerous studies. C. E. Wilder, Tamarkin, Pólya, Schwengeler, and others† have studied the distributions of the zeros of functions of the form

$f(z) = \sum_{j=0}^{J} A_j(z) \exp(\lambda_j z),$

where J is a positive integer, the λ's are constants, and the $A_j(z)$'s are analytic functions which behave essentially as powers of z when |z| is large. Our generalization consists of replacing the linear exponents $\lambda_j z$ by the general polynomials $\lambda_{jN}z^N + \cdots + \lambda_{j0}$. At the present time we shall not consider the still more general case in which we have non-constant coefficients $A_j(z)$ of the type described above; for the theory is complicated at best, and the chief
* Presented to the Society, October 29, 1932; received by the editors April 10, 1933, and, after revision, November 16, 1933.

† The previous work is conveniently summarized in the following expository paper: Langer, On the zeros of exponential sums and integrals, Bulletin of the American Mathematical Society, vol. 37 (1931), pp. 213-239. As this paper contains a rather full bibliography, it is unnecessary to give one here.




of the new phenomena, due to the generalization of the exponents, appear 
even when the coefficients are constants.* Broadly speaking, we find that the 
results for the general value of N are similar to, but more complicated than, 
the known results for the special case in which N = 1. The methods used here
do not differ fundamentally from those that have been used in the earlier 
studies; but, naturally, the analysis is considerably more intricate, both 
formally and otherwise. 


2. NORMAL FORM OF f(z) 


We have written f(z) in the form (1) in order to display the structure of 
the function in the clearest possible manner. In order to be able to work with 
the function effectively, however, we shall rewrite it in a certain normal form. 

The numbers $\lambda_{0N}, \dots, \lambda_{JN}$ may be all distinct or not. In either case, by properly collecting terms if these λ's are not all distinct, we can write f(z) in the form

(2) $f(z) = \sum_{m=0}^{M} f_m(z) \exp[g_m(z) + \mu_m z^N],$

where M is a positive integer, the numbers $\mu_0, \dots, \mu_M$ are distinct, $g_m(z)$ is a polynomial which, if it is not zero, is of degree less than N, and $f_m(z)$ is either the constant 1 or an E-function having an exponent $N_m < N$.

If any one of the functions $f_m(z)$ is an E-function, we arrange it as we have just arranged f(z). Thus we write

(3) $f_m(z) = \sum_{n=0}^{M_m} f_{mn}(z) \exp[g_{mn}(z) + \mu_{mn} z^{N_m}],$

with stipulations similar to the above.

Likewise, if any one of the functions $f_{mn}(z)$ is an E-function, we write it in the form

$f_{mn}(z) = \sum_{p=0}^{M_{mn}} f_{mnp}(z) \exp[g_{mnp}(z) + \mu_{mnp} z^{N_{mn}}],$

with similar stipulations.

We continue this process of arranging f(z) until it automatically termi- 
nates after a finite number of steps. It will be observed that the resulting 
normal form of f(z) is not unique; for a given sum of exponentials of poly- 
nomials, which is not a constant and not the exponential of a polynomial, 
can be separated in various ways into two factors, one of which is an E-func- 
tion and the other of which is the exponential of a polynomial. For example, 
if we have $f_m(z) \exp g_m(z) = \exp z^2 + \exp(z^2 + z)$, we can set


* Constant coefficients are effectively provided for by the constants $\lambda_{j0}$.


$f_m(z) = e^{-\lambda z} + e^{(1-\lambda)z}, \quad g_m(z) = z^2 + \lambda z,$


where λ is any constant. However, for our purposes the normal form of f(z) is effectively unique.
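The non-uniqueness in the example can be confirmed numerically. The sketch below assumes the factorization is taken as $f_m(z) = e^{-\lambda z} + e^{(1-\lambda)z}$, $g_m(z) = z^2 + \lambda z$, and compares the product with $\exp z^2 + \exp(z^2 + z)$ at a few sample points; the function names and sample values are illustrative.

```python
import cmath


def original(z):
    """exp(z^2) + exp(z^2 + z)."""
    return cmath.exp(z ** 2) + cmath.exp(z ** 2 + z)


def factored(z, lam):
    """f_m(z) * exp(g_m(z)) for the assumed one-parameter family of factorizations."""
    f_m = cmath.exp(-lam * z) + cmath.exp((1 - lam) * z)
    g_m = z ** 2 + lam * z
    return f_m * cmath.exp(g_m)


for z in (0.3 + 0.7j, -1.2 + 0.4j):
    for lam in (0, 1.5, -2 + 1j):
        difference = abs(original(z) - factored(z, lam))
        print(z, lam, difference < 1e-9 * max(1.0, abs(original(z))))
```

Each choice of λ gives a different pair of factors with the same product, so the E-function factor is determined only up to an exponential of a polynomial of lower degree.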


3. CRITICAL POLYGONS AND CRITICAL RAYS 


The distribution of the zeros of f(z) is closely related to certain geometri- 
cal figures which will now be defined. 

Let the points $\mu_0, \dots, \mu_M$ be plotted in the complex plane. These points are distinct, and there are at least two of them. Let the smallest convex polygon that contains these points in its interior or on its boundary be drawn. If the points $\mu_0, \dots, \mu_M$ are collinear, the polygon is to be regarded, in an obvious way, as having just two sides, which are coincident and face in opposite directions. We call this polygon the primary critical polygon for f(z), and we denote it by the symbol P.

For the present, we assign subscripts so that $\mu_0, \dots, \mu_{M'}$ are the vertices of P in counter-clockwise order, an arbitrarily chosen vertex being called $\mu_0$. If M' < M, the points $\mu_{M'+1}, \dots, \mu_M$ are in the interior of P or on the sides of P between the vertices. Let the side of P that follows the typical vertex $\mu_\alpha$ in counter-clockwise order be denoted by $L_\alpha$. From any point on $L_\alpha$ draw a normal exterior to P. Let $\phi_\alpha$ $(0 \le \phi_\alpha < 2\pi)$ be the angle between the positive real axis and this outward-drawn normal. Let the vertex denoted by $\mu_0$ be selected so that the φ's increase with their subscripts. From the origin we now draw the (M'+1)N rays $R_\alpha^{(\beta)}$ represented by the equations

(4) $R_\alpha^{(\beta)}$: $\operatorname{amp} z = -(\phi_\alpha + 2\beta\pi)/N$; $\alpha = 0, \dots, M'$; $\beta = 0, \dots, N-1$.

We call these rays the primary critical rays for f(z). It is to be noted that if the value of β is fixed, the M'+1 rays corresponding to the several values of α, in increasing order, succeed one another in clockwise order and are contained in a sector of angular opening 2π/N, and that the N sectors of this kind, corresponding to the several values of β in increasing order, succeed one another in clockwise order without overlapping.
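The construction of the primary critical polygon and rays can be sketched in code. The following assumes the rays are given by amp z = −(φ + 2βπ)/N, where φ runs over the directions of the outward normals to the sides of the convex hull of the μ's. The hull routine is the standard monotone chain algorithm, and all function names are illustrative rather than taken from the paper.

```python
import math


def convex_hull(points):
    """Vertices of the convex hull in counter-clockwise order (monotone chain)."""
    pts = sorted({(complex(p).real, complex(p).imag) for p in points})
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def primary_critical_rays(mus, N):
    """Amplitudes in [0, 2*pi) of the rays amp z = -(phi + 2*beta*pi)/N."""
    hull = convex_hull(mus)
    phis = []
    for i, (x0, y0) in enumerate(hull):
        x1, y1 = hull[(i + 1) % len(hull)]
        dx, dy = x1 - x0, y1 - y0
        # for a counter-clockwise polygon the outward normal of a side is (dy, -dx)
        phis.append(math.atan2(-dx, dy) % (2 * math.pi))
    phis.sort()
    return [(-(phi + 2 * beta * math.pi) / N) % (2 * math.pi)
            for beta in range(N) for phi in phis]


# f(z) = 1 + exp(z): the mu's are 0 and 1 and N = 1
print(primary_critical_rays([0, 1], 1))
```

For f(z) = 1 + exp z the two μ's are collinear, the hull degenerates to a two-sided polygon, and the two rays are the halves of the imaginary axis, on which the zeros z = (2k+1)πi lie.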

The primary critical rays divide the plane into (M'+1)N sectors, each of which has its vertex at the origin, is bounded by two of the rays, and has none of the rays in its interior. We denote the one of these sectors that is bounded on the clockwise side by the ray $R_\alpha^{(\beta)}$ by the symbol $S_\alpha^{(\beta)}$. Each of these sectors is understood to be an open point set.

As just above, let α be an arbitrarily chosen one of the integers 0, ..., M'. If the function $f_\alpha(z)$ is not a constant, we construct its primary




critical polygon P., its primary critical rays* 


RS: 


and the associated sectors 5‘). 


Let @ be one of the integers 0, - - - , M’, and let 8 be one of the integers 
0,---, MZ (assuming that f.(z) is not a constant). If the function fas(z) is 
not a constant, we construct its primary critical polygon P.s, its primary 
critical rays R® , and the associated sectors S®.. 

We continue this process of constructing polygons, rays and sectors until it automatically terminates, as it must, after a finite number of steps. It is to be noted that if m > M', we do not construct the figures for $f_m(z)$; and that if $0 \le \alpha \le M'$, $n > M_\alpha'$, we do not construct the figures for $f_{\alpha n}(z)$. A similar remark applies also to the further cases.

We have defined the primary critical rays for f(z); now we proceed to define critical rays of “higher order” for f(z). A secondary critical ray for f(z) is a primary critical ray for a function $f_\alpha(z)$, α = 0, ..., M', which lies in one of the sectors $S_\alpha^{(0)}, \dots, S_\alpha^{(N-1)}$. A tertiary critical ray for f(z) is a secondary critical ray for a function $f_\alpha(z)$, α = 0, ..., M', which lies in one of the sectors $S_\alpha^{(0)}, \dots, S_\alpha^{(N-1)}$. Critical rays of other orders are defined similarly. An essential feature of these definitions is the fact that we have the same subscript in the symbols $f_\alpha(z)$, $S_\alpha^{(0)}, \dots, S_\alpha^{(N-1)}$. It is understood that the definitions of the critical rays of higher orders of the functions $f_\alpha(z)$ are precisely analogous to the definitions of the critical rays of higher order of f(z); hence the definitions of the latter rays are complete. In this way we get a finite set of critical rays for f(z), and these critical rays are classified as primary, secondary, tertiary, etc. At the very least there are two primary critical rays. Whether or not there are any critical rays of higher order depends on the structure of the particular function under consideration.


4. ZERO-FREE REGIONS 


The first result concerning the distribution of the zeros of f(z) is stated in 
the following: 


THEOREM 1. There exists a set of half-strips†, equal in number to the critical rays of f(z), each extending in the direction of a different one of these critical rays, such that each zero of f(z) is a point of one or more of these half-strips.


* $M_\alpha'$ has the same significance in regard to $f_\alpha(z)$ that the previously defined symbol M' has in regard to f(z).

† By a half-strip we shall always mean the open set of points between two parallel straight lines and on one side of a line perpendicular to these.




The theorem is an immediate consequence of the following

LEMMA. There exists a set of half-strips, equal in number to the critical rays of f(z), each extending in the direction of a different one of these critical rays, and there exist two positive constants, A and B, such that if $|z| \ge A$, and if z is not in any one of the half-strips, we have the relation


(5) $|f(z)| \ge \exp[-B|z|^N]$.


The lemma will be proved by a process of induction. The reasoning in the following §4.1 proves the lemma directly for the case in which N = 1. It will be shown that if N > 1, the truth of the lemma is a consequence of the assumed truths of the corresponding lemmas concerning E-functions having exponents less than N.

The reasoning just referred to consists of two main steps. In the first place, we shall show that if we enclose each primary critical ray of f(z) in a sector, with vertex at the origin and of arbitrarily small angular opening, and if we construct certain half-strips each extending in the direction of a different non-primary critical ray, we have a relation of the form (5), provided |z| is large and provided z is not in any one of these sectors or half-strips. In the second place, we shall show that a relation such as (5) obtains at each sufficiently distant point of a small sector enclosing a primary critical ray, provided the point does not lie in a certain half-strip extending in the direction of the ray.

As most of the analysis used in the proof possesses no unusual features, 
many of the details will be left to the reader. 


4.1. PROOF OF THE LEMMA


Let α be an arbitrarily chosen one of the integers 0, ..., M', and let β be an arbitrarily chosen one of the integers 0, ..., N−1. Let ε be a positive number such that 2ε is less than the angular opening of the sector $S_\alpha^{(\beta)}$. Consider the sector $\Sigma_\alpha^{(\beta)}$ that is defined by the appropriate one of the following relations:

— 28x — Ne)/N < S — + 264 + Ne)/N; 
-,N-1: 
Za : — 282 — Ne)/N S ampz S — (om + 264 — 20 + Ne)/N; 
a=B=0: 


x: — 2x S ampz S — — € — (ou — 20) /N. 


Each point of $\Sigma_\alpha^{(\beta)}$, except the origin, is a point of $S_\alpha^{(\beta)}$.




If the sector $S_\alpha^{(\beta)}$ contains no critical ray of f(z), by the hypothesis for the induction we have a relation of the following form, provided z is in $\Sigma_\alpha^{(\beta)}$ and |z| is sufficiently large:

(7) $|f_\alpha(z) \exp g_\alpha(z)| \ge \exp[-B|z|^{N-1}]$, B a positive constant.

If $S_\alpha^{(\beta)}$ contains one or more critical rays of f(z), we take ε so small that $\Sigma_\alpha^{(\beta)}$ contains all of these rays in its interior. Then to each of these rays there corresponds a half-strip, extending in the direction of the ray, such that if z is in $\Sigma_\alpha^{(\beta)}$, but is not in any one of these half-strips, and if |z| is sufficiently large, we have a relation of the form (7). We denote the set of points in $\Sigma_\alpha^{(\beta)}$, and not in any of these half-strips, by the symbol $T_\alpha^{(\beta)}$. If $S_\alpha^{(\beta)}$ contains no critical rays of f(z), we use $T_\alpha^{(\beta)}$ as an alternative symbol for the sector $\Sigma_\alpha^{(\beta)}$.
We write f(z) in the form

$f(z) = \{\exp(\mu_\alpha z^N)\}\Big\{f_\alpha(z) \exp g_\alpha(z) + {\sum_{m=0}^{M}}' f_m(z) \exp[g_m(z) + (\mu_m - \mu_\alpha)z^N]\Big\},$

where the prime on the summation sign indicates that the term for m = α is to be omitted. It is easy to show, from the geometry of the polygon P, that in the sufficiently distant part of $\Sigma_\alpha^{(\beta)}$ we have

$\Big|{\sum_{m=0}^{M}}' f_m(z) \exp[g_m(z) + (\mu_m - \mu_\alpha)z^N]\Big| \le \exp[-B|z|^N],$

where B is a positive constant. It follows from the last relation, and from (7), that in the sufficiently distant part of $T_\alpha^{(\beta)}$ we have a relation of the form (5). This completes the first of the two main parts of the proof of the lemma.

In the second part of the proof it is convenient to employ a new assignment of the subscripts. Choose an arbitrary side of the polygon P, and denote it by the symbol L. We assign subscripts so that $\mu_0, \dots, \mu_{M''}$ are the μ's that lie on L, $\mu_0$ and $\mu_{M''}$ being the clockwise and counter-clockwise extremities of L, respectively.
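We write the function f(z) in the form

(8) $f(z) = [h(z) + k(z)] \exp[\tfrac{1}{2}(\mu_0 + \mu_{M''})z^N],$

where*

(9) $h(z) = \sum_{m=0}^{M''} f_m(z) \exp\{g_m(z) + [\mu_m - \tfrac{1}{2}(\mu_0 + \mu_{M''})]z^N\}, \quad k(z) = \sum_{m=M''+1}^{M} f_m(z) \exp\{g_m(z) + [\mu_m - \tfrac{1}{2}(\mu_0 + \mu_{M''})]z^N\}.$

* Of course, it may happen that M'' = M. In this event the function k(z) does not appear, and the next few steps in the proof are simply to be omitted.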


From any point on L draw a normal exterior to P, and let φ $(0 \le \phi < 2\pi)$ denote the angle between the positive real axis and this outward-drawn normal. It is clear that if m > M'', and if $\operatorname{amp}[\mu_m - \tfrac{1}{2}(\mu_0 + \mu_{M''})]$ is suitably defined,

$\phi + \tfrac{1}{2}\pi + \sigma \le \operatorname{amp}[\mu_m - \tfrac{1}{2}(\mu_0 + \mu_{M''})] \le \phi + \tfrac{3}{2}\pi - \sigma,$

where σ is a certain number that satisfies the relations $0 < \sigma < \pi/2$.

Let ε be a positive number less than σ. Consider any one of the sectors $U_\beta$ that are defined by the relations

$U_\beta$: $(-\phi - \epsilon - 2\beta\pi)/N \le \operatorname{amp} z \le (-\phi + \epsilon - 2\beta\pi)/N$, $\beta = 0, \dots, N-1$.

It is easy to show that if z is in $U_\beta$, and if |z| is sufficiently large, we have a relation of the form

$|k(z)| \le \exp[-B|z|^N]$, B a positive constant.
We now write the function h(z) in the form

$h(z) = f_0(z) \exp[g_0(z) + \tfrac{1}{2}(\mu_0 - \mu_{M''})z^N] \Big\{1 + \sum_{m=1}^{M''} \frac{f_m(z)}{f_0(z)} \exp[g_m(z) - g_0(z) + (\mu_m - \mu_0)z^N]\Big\},$

and, in order to simplify the formal side of the exposition slightly, we assume, for the moment, that the primary critical ray

$\operatorname{amp} z = -(\phi + 2\beta\pi)/N$

is not a critical ray of the function $f_0(z)$.

By the hypothesis for the induction, if the angular opening of the sector $U_\beta$ is taken sufficiently small, if z is in $U_\beta$, and if |z| is sufficiently large, we have a relation of the form

$|f_0(z)| \ge \exp[-B_0|z|^{N_0}]$, $B_0$ a positive constant.

There exist positive numbers, $C_m$ and $D_m$, such that, for all values of z,

$|f_m(z) \exp[g_m(z) - g_0(z)]| \le \exp[C_m|z|^{N-1} + D_m].$




It follows, therefore, that if z is in $U_\beta$, and |z| is sufficiently large, we have the following relation,* for m = 1, ..., M'':

$\Big|\frac{f_m(z)}{f_0(z)} \exp[g_m(z) - g_0(z) + (\mu_m - \mu_0)z^N]\Big| \le \exp\{B_0|z|^{N_0} + C_m|z|^{N-1} + D_m + \Re[(\mu_m - \mu_0)z^N]\}.$

Let us write

$z = re^{i\theta}, \quad \mu_m - \mu_0 = R_m e^{i(\phi + \pi/2)}, \quad m = 1, \dots, M'',$

where r is real and non-negative, and $R_m$ is positive. Let q be a real number such that $M''e^q < 1$. We now write the M'' relations

(10) $R_m r^N \cos(N\theta + \phi + \pi/2) + B_0 r^{N_0} + C_m r^{N-1} + D_m \le q,$

and proceed to investigate the several regions defined by them.

The boundary of the region defined by the typical relation (10) is represented by the equation

(11) $N\theta + \phi = \arcsin\Big\{\frac{1}{R_m r^N}\big[B_0 r^{N_0} + C_m r^{N-1} + (D_m - q)\big]\Big\}.$

For our purposes it is convenient to write this last relation, for large values of r, in the form

(12) $\theta = (-\phi - j\pi)/N + \frac{c_1}{r} + \frac{c_2}{r^2} + \cdots,$

where j is any integer, and the c's are constants. Equation (12) represents a curve which is asymptotic to a line parallel to the ray $\theta = -(\phi + j\pi)/N$, or to the ray itself.

Giving j successively the values 0, ..., 2N−1, we get, from (12), the representations of the 2N branches of the curve represented by (11). The branch corresponding to the even value 2β of j is asymptotic to a parallel to the bisector of $U_\beta$, or to the bisector itself. We are not directly concerned with the branches that correspond to odd values of j.

In the sufficiently distant part of the sector $U_\beta$ there are M'' branches of curves of the type just described, corresponding to the several values 1, ..., M'' of m.

As $R_1, \dots, R_{M''}$ are all positive, the region (10) lies on the counter-clockwise sides of the branches of the boundary that correspond to even values of j. The region lies on the clockwise sides of the branches that correspond to odd values of j.


* If u and v are real, and w = u + iv, we write $u = \Re w$, $v = \Im w$.




It follows from the above that we can draw a line parallel to the bisector 
of Uz, such that in the distant part of the sub-sector bounded by this line 
and the counter-clockwise side of U, the function 


Sm(2) 
1 
So(z) 


is bounded away from zero.* 
Now consider the function

(13) $$f_0(z) \exp[g_0(z) + \tfrac{1}{2}(\mu_0 - \mu_{M''})z^N].$$

If $z$ is in $U_\beta$, and if $|z|$ is sufficiently large, the modulus of this function is not
less than


exp [gm(2) — go(z) + (um — ] 


exp {— + R[3(u0 — J}, 


where By is a suitably chosen positive constant. By reasoning similar to that 
used just above, we show that in $U_\beta$ there is a sub-sector,* bounded by a line
parallel to the bisector of $U_\beta$ and by the counter-clockwise side of $U_\beta$, in the
distant part of which the function (13) is bounded away from zero.

It follows that in $U_\beta$ there is a sub-sector,* bounded by a line parallel to
the bisector of $U_\beta$ and by the counter-clockwise side of $U_\beta$, in the distant part
of which the function $h(z)$ is bounded away from zero.

Assuming provisionally that the ray amp $z = -(\phi + 2\beta\pi)/N$ is not a critical
ray of $f_{M''}(z)$, we show, in an entirely similar way, that in $U_\beta$ there is a
sub-sector,* bounded by a parallel to the bisector of $U_\beta$ and by the clockwise
side of $U_\beta$, in the distant part of which $k(z)$ is bounded away from zero.

It is easy to show that we have the results just stated even when the ray
amp $z = -(\phi + 2\beta\pi)/N$ is a critical ray of one or both of the functions $f_0(z)$,
$f_{M''}(z)$. By the hypothesis for the induction, the sector then contains a half-strip,
extending in the direction of the bisector of $U_\beta$, such that in the distant
parts of the regions within the sector and outside the half-strip, we have
relations of the form

$$|f_0(z)| \geq \exp[-B|z|^{N_0}], \qquad |f_{M''}(z)| \geq \exp[-B|z|^{N_{M''}}],$$


where B is a positive constant. Now we have only to confine our attention to 
values of z which lie in one or the other of the regions just described. With this 
understanding about the values of z under discussion, the preceding reasoning 
is valid without any essential change, and we are led again to the result stated 
in the preceding paragraph. 

It follows from (8), and what has been proved concerning the functions 


* The sub-sector is taken as including the points on its boundary. 



350 L. A. MacCOLL [April 


$h(z)$ and $k(z)$, that in $U_\beta$ there are two sub-sectors,* each bounded by a line
parallel to the bisector of $U_\beta$ and by a different one of the sides of the sector,
such that in the sufficiently distant parts of these sub-sectors we have a relation
of the form (5).

This completes the proof of the lemma. 


5. DISTRIBUTION OF THE ZEROS WITHIN A CRITICAL HALF-STRIP 


We shall call the half-strips determined in the preceding paragraphs, 
within which the zeros of f(z) must lie, critical half-strips. It is natural to 
classify the several critical half-strips as primary, secondary, tertiary, etc., 
according to the classification of the several critical rays to which they corre- 
spond. 

It has not yet been shown that f(z) has any zeros at all. We shall now 
prove that zeros do actually exist, and we shall obtain asymptotic formulas 
giving the distributions of the zeros in the various critical half-strips. Specifi- 
cally, the remainder of the paper will be devoted to proving the following 
two theorems. 


THEOREM 2. The number of zeros of $f(z)$ in the interior of a rectangle bounded
by segments of length $r$ of the sides of a primary critical half-strip, by the end of
the half-strip, and by a segment congruent to the end, is equal, for $r$ large, to

$$\frac{l\,r^N}{2\pi}[1 + O(1/r)],$$

where $l$ is the length of the side of the primary critical polygon that corresponds†
to the half-strip.


THEOREM 3. Let $Z_f(r)$ and $Z_{f_{\alpha\cdots\gamma}}(r)$ denote, respectively, the numbers of
zeros of $f(z)$ and $f_{\alpha\cdots\gamma}(z)$ in the interior of a rectangle bounded (1) by segments
of length $r$ of the sides of a critical half-strip which is primary for $f_{\alpha\cdots\gamma}(z)$ and
non-primary for $f(z), f_\alpha(z), \dots$;‡ (2) by the end of the half-strip;
(3) by a segment congruent to the end. Then, if $r$ is large, we have

$$Z_f(r) = Z_{f_{\alpha\cdots\gamma}}(r)\,[1 + O(1/r)].$$


* See footnote on p. 349. 

† Each side of the polygon determines, through the direction of the outward-drawn normal, a
set of primary critical rays, and each primary critical half-strip extends in the direction of one of these
rays. This is the correspondence referred to.

‡ It is to be observed that a non-primary critical ray for $f(z)$ is a critical ray for a definite function
$f_\alpha(z)$; if it is a non-primary critical ray for $f_\alpha(z)$, it is a critical ray for a definite function $f_{\alpha\beta}(z)$;
and so on. The ray is a primary critical ray for a definite function $f_{\alpha\cdots\gamma}(z)$. Also, it is to be observed
that the critical half-strip for $f(z)$, corresponding to the ray, can be considered as the critical half-strip
for each of the functions $f_\alpha(z), f_{\alpha\beta}(z), \dots, f_{\alpha\cdots\gamma}(z)$, corresponding to the ray.




We shall prove these theorems by a process of induction. Our reasoning
proves Theorem 2 directly for the case in which $N = 1$. We shall show that
if $N > 1$, the theorems are consequences of the corresponding theorems
concerning Z-functions having exponents less than $N$.*


5.1. PROOF OF THEOREM 2


We here employ the assignment of subscripts and the notation that were 
used in the latter part of §4.1. 

Consider a particular primary critical half-strip, say the one that extends
in the direction of the bisector of the sector $U_\beta$.

Consider a rectangle, $R$, the vertices of which, in counter-clockwise order,
are denoted by $V_1, V_2, V_3, V_4$, respectively. The rectangle is taken so that (1)
$V_1V_2$ is a segment of the clockwise side of the half-strip; (2) $V_3V_4$ is a segment
of the counter-clockwise side of the half-strip; (3) $f(z)$ does not vanish on
either of the segments $V_2V_3$, $V_4V_1$. We take the half-strips so that $f(z)$ does
not vanish on either side of any one of them; hence $f(z)$ does not vanish on
the boundary of $R$. Let the complex number corresponding to the typical
vertex $V_j$ of $R$ be $z_j$.

The number, $Z_f(R)$, of zeros of $f(z)$ contained in $R$ is given by the formula

$2\pi Z_f(R)$ = variation of amp $f(z)$ along the curve $V_1V_2V_3V_4V_1$.

For the sake of brevity, we use a self-explanatory notation in which this formula
becomes

(14) $$2\pi Z_f(R) = [\text{v.a. } f(z), V_1V_2V_3V_4V_1] = [\text{v.a. } f(z), V_1V_2] + [\text{v.a. } f(z), V_2V_3] + [\text{v.a. } f(z), V_3V_4] + [\text{v.a. } f(z), V_4V_1].$$

We shall estimate the value of each of the terms in the right-hand member of
(14), assuming that $R$ lies in the distant part of the half-strip.
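The counting formula above can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes the hypothetical test function exp(z) + 1 (a sum of two exponentials whose zeros are known to lie at z = iπ(2k + 1)), and the helper names `amp_variation` and `zeros_in_polygon` are invented for illustration. It accumulates the variation of amp f(z) along each side of a rectangle and divides the total by 2π.

```python
import cmath
import math


def f(z):
    # Hypothetical test function (an assumption, not from the paper):
    # exp(z) + 1 vanishes exactly at z = i*pi*(2k + 1).
    return cmath.exp(z) + 1


def amp_variation(func, a, b, steps=2000):
    """Variation of amp func(z) as z runs along the straight segment a -> b."""
    total = 0.0
    prev = func(a)
    for k in range(1, steps + 1):
        cur = func(a + (b - a) * k / steps)
        # Each small step changes the amplitude by less than pi,
        # so the principal value of the phase of the ratio is the true change.
        total += cmath.phase(cur / prev)
        prev = cur
    return total


def zeros_in_polygon(func, vertices, steps=2000):
    """Number of zeros inside the polygon: (sum of side variations) / (2*pi)."""
    total = 0.0
    n = len(vertices)
    for k in range(n):
        total += amp_variation(func, vertices[k], vertices[(k + 1) % n], steps)
    return round(total / (2 * math.pi))


# Vertices V1, V2, V3, V4 in counter-clockwise order; the rectangle
# straddles the imaginary axis, along which the zeros of f lie.
rect = [complex(1, 0), complex(1, 20), complex(-1, 20), complex(-1, 0)]
count = zeros_in_polygon(f, rect)
```

For this rectangle the zeros iπ, 3iπ and 5iπ lie inside and 7iπ lies outside, so the computed count should be 3. The step count is a tuning assumption: it only needs to be fine enough that the amplitude changes by less than π between consecutive sample points.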
First consider the term [v.a. $f(z)$, $V_1V_2$]. We write $f(z)$ in the form


M''-1 
= {fure(s)} {exp + {1 +> 
h(z) 


The functions h(z), k(z) are defined by equations (9). We now have the fol- 
lowing relation: 


exp [gm(z) 


=> FiF2F3F 4. 


gu(s) + im — + 


* The widths of the half-strips, and the locations of the ends of the half-strips are somewhat arbi- 
trary. The formulas stated in the theorems imply, of course, that the half-strips have been definitely 
chosen. 




$$[\text{v.a. } f(z), V_1V_2] = \sum_{i=1}^{4} [\text{v.a. } F_i, V_1V_2].$$

It has been shown in §4.1 that $|F_3 - 1| < 1$, and that $|F_4 - 1| < 1$, when $z$
is on the distant part of the clockwise side of the half-strip. Therefore, if $R$
is sufficiently distant,

$$|[\text{v.a. } F_3, V_1V_2]| < \pi, \qquad |[\text{v.a. } F_4, V_1V_2]| < \pi.$$

We have immediately 
[v.a. ViV2] = + | 3 + |. 
It remains to consider the term [v.a. $f_{M''}(z)$, $V_1V_2$]. If $f_{M''}(z)$ is constant,
this term is zero. Henceforth assume that the function is non-constant. The
function is of the form*


(15) $$f_{M''}(z) = \sum_{j=0}^{J} \exp(a_{jN_{M''}} z^{N_{M''}} + \cdots + a_{j0}).$$
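Let $z'$ be the point collinear with $z_1$ and $z_2$ that is nearest the origin; let
$z''$ be a particular one of the two points that are collinear with $z_1$ and $z_2$ and
are such that $|z'' - z'| = 1$. We write

$$z = z' + (z'' - z')t,$$

thereby changing the independent variable from $z$ to $t$. We write

$$f_{M''}(z) = \varphi(t) = \sum_{j=0}^{J} \exp(b_{jN_{M''}} t^{N_{M''}} + \cdots + b_{j0}),$$

where the $b$'s are functions of $z'$, $z''$, and the $a$'s of (15). Let $b_{jn} = b'_{jn} + ib''_{jn}$,
where $b'_{jn}$ and $b''_{jn}$ are real.
We now consider the functions

$$A(t) = \sum_{j=0}^{J} \exp(b'_{jN_{M''}} t^{N_{M''}} + \cdots + b'_{j0}) \cos(b''_{jN_{M''}} t^{N_{M''}} + \cdots + b''_{j0}),$$

$$B(t) = \sum_{j=0}^{J} \exp(b'_{jN_{M''}} t^{N_{M''}} + \cdots + b'_{j0}) \sin(b''_{jN_{M''}} t^{N_{M''}} + \cdots + b''_{j0}).$$

When the points $z$, $z_1$, $z_2$ are collinear, $A(t)$ and $iB(t)$ are, respectively, the
real and imaginary parts of $\varphi(t) = f_{M''}(z)$. Consequently, for such $z$'s we have
the relation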

* It is not implied that the J here is the same as the J in equation (1). The same remark applies 
in several other places in the following work. 




(16) $$\tan \operatorname{amp} f_{M''}(z) = \tan \operatorname{amp} \varphi(t) = \frac{B(t)}{A(t)}.$$


Now $f_{M''}(z) \neq 0$ on the distant part of the clockwise side of the half-strip.
Hence, $A(t)$ and $B(t)$ cannot vanish simultaneously on the part of the real
axis in the $t$-plane that corresponds to the distant part of this side of the half-strip.
If either $A(t)$ or $B(t)$ is identically zero, amp $f_{M''}(z)$ is constant on the
segment $V_1V_2$. Henceforth let us assume that neither $A(t)$ nor $B(t)$ is identically
zero. It follows from (16), and the properties of the tangent function,
that amp $f_{M''}(z)$ cannot vary by as much as $\pi$ on any distant segment of the
side of the half-strip without $A(t)$ vanishing on the corresponding segment of
the real axis in the $t$-plane. Therefore, if we can show that $A(t)$ has not more
than, say, $\nu$ zeros on the segment in the $t$-plane that corresponds to the segment
$V_1V_2$, it will follow that

$$-(\nu+1)\pi < [\text{v.a. } f_{M''}(z), V_1V_2] < (\nu+1)\pi.$$

Thus the problem of estimating the value of [v.a. $f_{M''}(z)$, $V_1V_2$] is related to
the problem of estimating the number of zeros of $A(t)$ on a certain distant
segment on the real axis in the $t$-plane.

If $A(t)$ is the exponential of a polynomial, it has no zeros. Henceforth
assume that $A(t)$ is not the exponential of a polynomial. Then $A(t)$ is a function
of the same type as the function $f(z)$ which is the subject of this entire
study, except for the essential difference that, instead of the exponent $N$, we
have here the smaller exponent $N_{M''}$. Therefore, the zeros of $A(t)$, if there
are any, are confined to certain half-strips in the $t$-plane. We are interested
in those zeros, if there are any, which lie on a certain segment on the distant
part of a particular half of the real axis. If no one of the critical half-strips
for $A(t)$ contains this half of the real axis, there are no zeros on the segment in
question (provided the rectangle $R$ is taken sufficiently distant in the
$z$-plane). If, on the other hand, the above-mentioned half of the real axis in
the $t$-plane is contained in some critical half-strip for $A(t)$, we draw a rectangle,
$R'$, in the latter half-strip (the rectangle having two of its sides on
the sides of the half-strip, containing in its interior the segment corresponding
to $V_1V_2$, and being taken so that $A(t)$ does not vanish on the boundary of
$R'$), and we estimate the number of zeros of $A(t)$ in $R'$ by means of the theorems
concerning $A(t)$ corresponding to our Theorems 2 and 3. The number
of zeros on the segment corresponding to $V_1V_2$ does not exceed the number of
zeros in $R'$. By the hypothesis for the induction, we have the following expression
for the number of zeros of $A(t)$ contained in the rectangle $R'$:




where $a$ is a positive constant, $n$ is a positive integer not greater than $N_{M''}$,
$t_1$ and $t_2$ are the values of $t$ that correspond to $z_1$ and $z_2$, respectively, and $t_2 + \tau''$
and $t_1 - \tau'$ are the values of $t$ corresponding to the points at which the real
$t$-axis intersects the boundary of $R'$; $|\tau'|$ and $|\tau''|$ may be taken arbitrarily
small, but not zero.

We have now completed the estimation of the value of the term
[v.a. $f(z)$, $V_1V_2$]. It is evident that the term [v.a. $f(z)$, $V_3V_4$] can be discussed
in an altogether similar manner. No details of this discussion need be given
here.

Next we shall estimate the value of the term

$$[\text{v.a. } f(z), V_2V_3] = \Im[\tfrac{1}{2}(\mu_0 + \mu_{M''})z_3^N - \tfrac{1}{2}(\mu_0 + \mu_{M''})z_2^N] + [\text{v.a. } (h(z) + k(z)), V_2V_3].$$

Let $z_0$ be the point at which the bisector of the sector $U_\beta$ intersects the
straight line through $V_2$ and $V_3$. We write


23 — 22 = 2 = 20 + ft, 

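thus changing the independent variable from $z$ to $t$. We also write

$$h(z) = \psi(t), \qquad k(z) = \omega(t).$$

The function $\psi(t) + \omega(t)$ is of the form

$$\psi(t) + \omega(t) = \sum_{j=0}^{J} \exp(\gamma_{jN} t^{N} + \cdots + \gamma_{j0}),$$

where the $\gamma$'s are constants which depend on $z_0$, $\phi$, and the $\lambda$'s in (1). Let
$\gamma_{jn} = \gamma'_{jn} + i\gamma''_{jn}$, where $\gamma'_{jn}$ and $\gamma''_{jn}$ are real.
We now consider the functions

$$G(t) = \sum_{j=0}^{J} \exp(\gamma'_{jN} t^{N} + \cdots + \gamma'_{j0}) \cos(\gamma''_{jN} t^{N} + \cdots + \gamma''_{j0}),$$

$$H(t) = \sum_{j=0}^{J} \exp(\gamma'_{jN} t^{N} + \cdots + \gamma'_{j0}) \sin(\gamma''_{jN} t^{N} + \cdots + \gamma''_{j0}).$$

If $z$, $z_2$, and $z_3$ are collinear, $G(t)$ and $iH(t)$ are, respectively, the real and
imaginary parts of the function $\psi(t) + \omega(t)$.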
For we have 


Hm — + ur) = pm exp [i(¢ + 2/2)], 




where $\rho_m$ is real. Also, we have

$$z_0 = |z_0| \exp[-i(\phi + 2\beta\pi)/N].$$

Hence


+ 
(17) ( = ipm|zo|¥ + exp + 


Let $T$ be any positive constant. The relation (17) shows that when
$|t| < T$, and when $|z_0|$ is sufficiently large, we have a relation of the form

$$|\psi(t)| < \exp[B_1|z_0|^{N-1}],$$

where $B_1$ is a positive constant. We know that in the distant part of $U_\beta$ we
have

$$|k(z)| < \exp[-B_2|z|^{N}],$$

where $B_2$ is a positive constant; hence, when $|t| < T$, and when $|z_0|$ is sufficiently
large, we have

$$|\omega(t)| < \exp[-B_3|z_0|^{N}],$$

where $B_3$ is a positive constant. It follows that when $|t| < T$, and when $|z_0|$
is sufficiently large, we have relations of the form

(18) $$|G(t)| < \exp[B|z_0|^{N-1}], \qquad |H(t)| < \exp[B|z_0|^{N-1}],$$

where $B$ is a positive constant.
When $z$ is collinear with $z_2$ and $z_3$ we have the relation

$$\tan \operatorname{amp} [h(z) + k(z)] = \frac{H(t)}{G(t)}.$$


Because of the way in which the rectangle $R$ has been taken, $G(t)$ and $H(t)$
do not vanish simultaneously at any point on the segment corresponding to
$V_2V_3$. If either of the functions $G(t)$, $H(t)$ is identically zero, amp $[h(z) + k(z)]$
is constant along $V_2V_3$. Henceforth assume that neither $G(t)$ nor $H(t)$ is
identically zero. Then, if $G(t)$ vanishes not more than, say, $\nu$ times on the
segment in the $t$-plane that corresponds to $V_2V_3$, the following relation holds:

$$-(\nu+1)\pi < [\text{v.a. } (h(z) + k(z)), V_2V_3] < (\nu+1)\pi.$$

We have the same relation if $H(t)$ does not vanish more than $\nu$ times on the
segment corresponding to $V_2V_3$.

We know that $h(z)$ is bounded away from zero when $z$ is on the distant
part of the clockwise side of the half-strip. The same is, therefore, true of the
function $h(z) + k(z)$. Hence, if $|z_0|$ is sufficiently large, and if $t_2$ is the value of




$t$ that corresponds to the value $z_2$ of $z$, one at least of the numbers $|G(t_2)|$,
$|H(t_2)|$ is greater than a certain fixed positive number $\delta$. To fix the ideas,
suppose that $|G(t_2)| > \delta$; similar reasoning applies if $|G(t_2)| \leq \delta$ and
$|H(t_2)| > \delta$.

We wish to establish an upper bound for the number of zeros of $G(t)$ on
the segment corresponding to $V_2V_3$. For this purpose we employ a result due
to Jensen, which we state for our purposes in the following restricted form*:

Let $G(t)$ be an integral function, such that

$$|G(t)| \leq M(r) \quad \text{for} \quad |t - t_2| \leq r.$$

Let $G(t)$ vanish at the points $t^{(1)}, t^{(2)}, \dots, t^{(\nu)}$, such that

$$0 < |t^{(j)} - t_2| < r.$$

Then

(19) $$\frac{r^{\nu}\,|G(t_2)|}{|t^{(1)} - t_2|\,|t^{(2)} - t_2| \cdots |t^{(\nu)} - t_2|} \leq M(r).$$


Choose a positive number $r$ large enough so that the segment in the
$t$-plane corresponding to $V_2V_3$ is contained in the interior of the circle
$|t - t_2| = r$. The value of $r$ may be, and is, taken to be independent of $|z_0|$.
Denote by $t_3$ the value of $t$ corresponding to $z_3$. We are interested in those
zeros of $G(t)$, if there are any, that are within or on the circle $|t - t_2| = |t_3 - t_2|$.
We take the $t$'s of the above theorem to be just these zeros. We then have,
by (18) and (19),

$$\left(\frac{r}{|t_3 - t_2|}\right)^{\nu} \leq \frac{M(r)}{|G(t_2)|} < \frac{\exp[B|z_0|^{N-1}]}{\delta},$$

or

(20) $$\nu < \frac{B|z_0|^{N-1} - \log \delta}{\log \dfrac{r}{|t_3 - t_2|}},$$

the logarithms being real. This completes our estimation of the value of the term
[v.a. $f(z)$, $V_2V_3$]. It is to be noted that $r$ is a constant greater than $|t_3 - t_2|$, so
the only variable in the right-hand member of (20) is $|z_0|$.

The term [v.a. $f(z)$, $V_4V_1$] can be discussed in a way that is entirely similar
to the way in which we have discussed [v.a. $f(z)$, $V_2V_3$]. No details of this
discussion need be given here.

To complete the proof of Theorem 2 we write out the complete expression 
for the number of zeros contained in the rectangle R, using the estimates we 


* Bieberbach, Lehrbuch der Funktionentheorie, vol. II, p. 109. 




have found for the several terms in (14). We shall consider the side $V_4V_1$ of
$R$ as fixed, and the side $V_2V_3$ as variable; in particular we shall confine our
attention to cases in which the sides $V_1V_2$ and $V_3V_4$ are long. Collecting our
results, we see that the number of zeros in $R$ is given by the equation


= (uy. Mo) + | + go(zs) | 
+ [v.a. ViV2] + [v.a. fo(z), VsVa] + [v.a. (A(z) + + W, 


where W is a quantity that is less, in absolute value, than a fixed number. 
Now, in terms of the notation used before, 


— wo = — ho | exp [i(@ + x/2)], 
= (zo + = |zo|¥ exp (— i¢) +---, 
= (29 + fts)¥ = |zo|¥ exp (— i¢) +---, 
and hence 
+ terms in lower powers of |zo|. 


We have previously seen that if | zo| is sufficiently large we have relations 
of the forms 


| [v.a. ViV2] | S 
|[v.a. fo(z), VaVa}| 
| [v.a.(a(z) + k(z)), VeVs]| 


where the a’s are positive constants. Also, if |zo| is large, 


— go(zs) | | org |zo 


where a, is a positive constant. 
It follows from the relations we have obtained that when $|z_0|$ is sufficiently
large we have

(21) $$Z_f(R) = \frac{|\mu_{M''} - \mu_0|\,|z_0|^{N}}{2\pi}\,[1 + O(1/|z_0|)].$$


Equation (21) contains our fundamental result; from it Theorem 2 follows 
at once. 
5.2. PROOF OF THEOREM 3


We revert to the notation used in §3.
Suppose that the function $f_\alpha(z)$ has a critical ray lying in one of the sectors
$S_i$. We write




$$F_\alpha(z) = 1 + {\sum_{m=0}}' \frac{f_m(z)}{f_\alpha(z)} \exp[g_m(z) - g_\alpha(z) + (\mu_m - \mu_\alpha)z^N],$$

the prime on the summation sign indicating that the term for $m = \alpha$ is to be
omitted. The zeros of $f(z)$ are the same as those of the function $f_\alpha(z)F_\alpha(z)$.
Our reasoning depends essentially on the following two theorems from the
theory of integral functions*:

1. Let the integral function $F(z)$ be of finite order $\rho$. Let $z_1, z_2, z_3, \dots$ denote
the non-zero zeros of $F(z)$, and let $h$ be any real number greater than $\rho$. Then
the series

$$\sum_{j} |z_j|^{-h}$$

converges.

2. Let the integral function $F(z)$ be of finite order $\rho$. Let $h$ be any positive
number greater than $\rho$, and let $\epsilon$ be any positive number. About each of the
non-zero zeros, $z_j$, of $F(z)$ as center describe a circle $\Gamma_j$ of radius $|z_j|^{-h}$. Then if the
point $z$ is outside each of the circles $\Gamma_j$, and if $|z|$ is sufficiently large, we have
the relation

$$|F(z)| > \exp[-|z|^{\rho+\epsilon}].$$

It is to be noted that the first theorem insures that $z$'s such as are referred
to in the second theorem exist.

Obviously, the order of $f_\alpha(z)$ is not greater than $N_\alpha$. Let $h$ be a positive
number greater than $N_\alpha$. Let $\epsilon$ be a positive number less than unity. By the
second theorem cited above, if about each non-zero zero $z_j$ of $f_\alpha(z)$ as center
we describe a circle $\Gamma_j$ of radius $|z_j|^{-h}$, and if we take $z$ outside of each of
these circles, and such that $|z|$ is sufficiently large, we have

$$|f_\alpha(z)| > \exp[-|z|^{N_\alpha+\epsilon}].$$

It follows readily from considerations similar to those used in the proof of
Theorem 1 that if $z$ is in the sufficiently distant part of a certain sector $S$ that
encloses the critical ray under consideration, we have a relation of the form

$$|f_\alpha(z)| \cdot |F_\alpha(z) - 1| < \exp[-B|z|^{N}],$$

where $B$ is a positive constant.

Let $k$ be any positive number less than unity. It is clear, from the foregoing,
that we can find a positive number $K$ such that if $z$ is in the sector $S$,
is outside of each of the circles $\Gamma_j$, and is such that $|z| > K$, we have

(22) $$|F_\alpha(z) - 1| \leq k < 1.$$


* Bieberbach, Lehrbuch der Funktionentheorie, vol. II, p. 243 and p. 268.




We denote the set of all such points $z$ by the symbol $\Omega$.

Now consider any simple closed regular curve $\Gamma$ that is composed entirely
of points of the set $\Omega$, and which does not pass through any zero of $f(z)$ or of
$f_\alpha(z)$. The number, $Z_f(\Gamma)$, of zeros of $f(z)$ within $\Gamma$ is given by the formula

(23) $$Z_f(\Gamma) = \frac{1}{2\pi}[\text{v.a. } f_\alpha(z), \Gamma] + \frac{1}{2\pi}[\text{v.a. } F_\alpha(z), \Gamma].$$

The first term in the right-hand member of (23) is the number of zeros of
$f_\alpha(z)$ within $\Gamma$. By (22), we have

$$-\frac{1}{2} < \frac{1}{2\pi}[\text{v.a. } F_\alpha(z), \Gamma] < \frac{1}{2}.$$

Hence the number of zeros of $f(z)$ within $\Gamma$ is equal to the number of zeros of
$f_\alpha(z)$ in the same region. This is our fundamental result.

Consider a rectangle bounded (1) by segments of length r of the sides of 
the half-strip corresponding to the critical ray under consideration; (2) by 
the end of the half-strip; (3) by a segment congruent to the end. Let $V_1$, $V_2$,
$V_3$, $V_4$ denote the vertices of the rectangle in counter-clockwise order, the
segment V,V; being the end of the half-strip. We consider three other rec- 
tangles, ViV{9 VSO V4, VSP Vy, and The segments 
VS, VS, VeVs, Vo? VS) are assumed to be crossed in that order 
as we proceed outward along the half-strip. The segments , VO 
VS» Vs, VL are assumed to consist entirely of points of the 
set Furthermore, the boundaries of the rectangles V{ VX , 
Vio VS) Vs) V{ are assumed not to pass through any zeros of f(z). Let the 
symbols R, Ro, Ri, Re, respectively, denote the rectangles in the order in 
which they have been named. Let Rj, Ry denote the rectangles 
VO VO VO VS Ve) VS V\ , respectively. 

Let $Z_f(R)$ denote the number of zeros of $f(z)$ within $R$. Similar symbols
will be used in similar senses without further explanation.

Now obviously, 


Z,(R) < ) Z(Ro) 
Zja(Ro) + Zj(R2)  Zy(Ro) +Z;(Ri) Zs,(Ro) + 
It has been shown that 
ZARj) j= 1,2. 


* It is a simple consequence of the first theorem cited above, and of our other results, that this 
condition can be satisfied. 




By the hypothesis for the induction, we have, for $j = 1, 2$,


ar;* ar” 
Z;,(R}) = + O(1/r;)] + O(1/r0)] 


where a is a positive constant, m is a positive integer not greater than N,, 
and fo, 71, 72 are the lengths of the segments V,V{ , ViV2™, Vi V2, respec- 
tively. 

We hold $r_0$ fixed. By the first theorem cited above, if we make the segment
$V_1V_2$ sufficiently long, we can make $r_2 - r_1$ arbitrarily small without violating
any of our previous stipulations. Now an easy and obvious calculation gives
us the result stated in Theorem 3.


BELL TELEPHONE LABORATORIES, 
New York, N. Y. 




IDEAL THEORY AND ALGEBRAIC 
DIFFERENTIAL EQUATIONS* 


BY 
H. W. RAUDENBUSH, Jr. 


INTRODUCTION 


J. F. Ritt† introduced the idea of irreducible system of algebraic differential
equations and showed that every system of such equations is equivalent
to a finite set of irreducible systems.

One of the objects of this paper is to develop a special type of abstract
ideal theory which has Ritt's theorem as a consequence. The elements of our
ideals are polynomials in unknowns $y_1, \dots, y_n$, and a certain number of their
derivatives. Following Ritt, we call these polynomials forms. The coefficients
in these forms are assumed to be elements of a differential field $\mathcal{F}$ of characteristic
zero.‡ A differential field is a commutative field (as in abstract algebra)
whose elements $a, b, \dots$ have unique derivatives $a_1, b_1, \dots$ which are elements
of the field. These derivatives must satisfy the rules $(a+b)_1 = a_1 + b_1$
and $(ab)_1 = a_1b + ab_1$.§ The totality of these forms with coefficients in $\mathcal{F}$ is a
differential ring $R$.|| We consider differential ideals,¶ which are ideals containing
together with any element its derivative. An example given by Ritt
shows that there exists a differential ideal of $R$ having no finite subset, such
that every element of the ideal is a linear combination of elements of the
subset and their derivatives with forms of $R$ as coefficients.**

Certain results of Ritt suggested that we consider, as our purpose permits,
only differential ideals which have the property that if they contain an element
$a$ of $R$, they contain any element $b$ of $R$ such that a positive power of
$b$ is $a$. We call these differential ideals perfect differential ideals. We show that
every perfect differential ideal of $R$ is the intersection of a finite number of prime
perfect differential ideals.


* Presented to the Society, October 28, 1933, and December 27, 1933; received by the editors
November 24, 1933.

† J. F. Ritt, Differential Equations from the Algebraic Standpoint, Colloquium Publications of
this Society, vol. 14. Cf. p. 14.

‡ For definitions of terms of abstract algebra see B. L. van der Waerden, Moderne Algebra.

§ Abstract differential fields have been treated by R. Baer, Algebraische Theorie der differentiierbaren
Funktionenkörper, I, Heidelberger Akademie der Wissenschaften, Sitzungsberichte,
Mathematisch-Naturwissenschaftliche Klasse, 1927-1928, and by the author, Differential fields and ideals
of differential forms, Annals of Mathematics, vol. 34 (1933), pp. 509-517. They have been used by
O. Ore, Formale Theorie der linearen Differentialgleichungen, Journal für Mathematik, vol. 167 (1932),
pp. 221-234, and vol. 168 (1932), pp. 233-252.

|| Raudenbush, loc. cit., p. 514. In the definition of differential field, substitute ring for field to
obtain the definition of differential ring.

¶ Raudenbush, loc. cit., p. 516.

** Ritt, loc. cit., p. 12.

The use of perfect differential ideals was suggested by the following two 
results of Ritt: 

(a) Every infinite system of forms has a finite subsystem whose manifold of 
solutions is identical with that of the infinite system.* 

(b) Let $F_1, \dots, F_r$; $G$ be forms such that $G$ has every solution of the system
$F_1, \dots, F_r$. Then some power of $G$ is a linear combination of the $F_i$ and a certain
number of their derivatives with forms for coefficients.†

We obtain abstract theorems that specialize to a combination of these 
results of Ritt. For instance, we show that every perfect differential ideal of R 
has a finite subset such that every form of the ideal has a power which is a linear 
combination of the forms of the subset and their derivatives with forms of R for 
coefficients. The proof of this basis theorem is like the proof of Ritt’s result 
(a) in fundamental respects, but there are essential differences. We also ob- 
tain an abstract generalization of Ritt’s result (b). The conciseness of the 
proof of this theorem is an indication of the simplicity of our theory. 

Having established the basis theorem, the development of our ideal theory
follows approximately the well known methods of E. Noether.‡


PERFECT DIFFERENTIAL IDEALS 


1. We consider a fixed differential ring R of characteristic zero. 

The intersection of any arbitrary set of differential ideals is a differential
ideal. For let $a$ be any element of the intersection. Then $a$ is an element of
every ideal of the set; hence the derivative $a_1$ is in the intersection. The intersection,
which is known to be an ideal, is then a differential ideal. The intersection
of any arbitrary set of perfect differential ideals is a perfect differential
ideal. Let $a$ and $b$ be elements of $R$ such that $a$ is in the intersection
and some power of $b$ is $a$. Then $a$ is in every ideal of the set, hence also $b$.
Therefore the intersection is a perfect differential ideal.

Let $\sigma$ be an arbitrary set of elements of $R$. We notice that $R$ is a perfect
differential ideal. The intersection of the differential ideals containing $\sigma$ will
be called the differential ideal $[\sigma]$ determined by $\sigma$. $[\sigma]$ is uniquely defined.
The intersection of all perfect differential ideals containing $\sigma$ we call the
perfect differential ideal $\{\sigma\}$ determined by $\sigma$. $\{\sigma\}$ is uniquely defined.


* Ritt, loc. cit., p. 10.
† Ritt, loc. cit., p. 108.
‡ E. Noether, Idealtheorie in Ringbereichen, Mathematische Annalen, vol. 83 (1921), pp. 24-66.




Let $\alpha$ be any set of elements of $R$. We shall denote by $\alpha'$ the set consisting
of all elements of $R$ which have a positive integral power in $\alpha$. Using the set $\sigma$
of the preceding paragraph, we define $\sigma_n$ recursively as follows:

$$\sigma_1 = [\sigma], \qquad \sigma_n = [\sigma'_{n-1}] \quad (n = 2, 3, 4, \dots).$$

Let $\delta$ denote the totality of elements of the sets $\sigma_n$. Then $\delta$ is a perfect differential
ideal and is contained in $\{\sigma\}$, hence is $\{\sigma\}$. This means that any element
$t$ of $\{\sigma\}$ is in some $\sigma_n$ with a sufficiently large subscript.


Lemma. If a differential ideal $\delta$ contains a positive integral power $a^p$ of an
element $a$ it contains the positive integral power $a_1^{2p-1}$ of the derivative $a_1$ of $a$.


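$\delta$ contains $(a^p)_1 = pa^{p-1}a_1$, hence $\delta$ contains $a^{p-1}a_1$.* Assume that $\delta$ contains
$a^{p-r}a_1^{2r-1}$, where $r < p$. Then $\delta$ contains

$$a_1(a^{p-r}a_1^{2r-1})_1 - (2r-1)a_2(a^{p-r}a_1^{2r-1}) = (p-r)a^{p-r-1}a_1^{2r+1},$$

where $a_2 = (a_1)_1$; hence $\delta$ contains $a^{p-r-1}a_1^{2r+1}$. Applying this result $p-1$ times
to $a^{p-1}a_1$ we find that $\delta$ contains $a_1^{2p-1}$.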
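As a hedged illustration of the inductive step (a worked case chosen here, not taken from the original text), suppose the ideal, written $\delta$ below, contains $a^2$, so that $p = 2$ and the lemma predicts that it contains $a_1^{3}$. Here $a_1$ and $a_2$ denote the first and second derivatives of $a$:

```latex
\begin{align*}
(a^2)_1 &= 2a\,a_1 \in \delta
  \quad\Longrightarrow\quad a\,a_1 \in \delta
  \quad\text{(using characteristic zero)},\\
a_1\,(a\,a_1)_1 - a_2\,(a\,a_1)
  &= a_1\,(a_1^{2} + a\,a_2) - a\,a_1 a_2
   = a_1^{3} \in \delta .
\end{align*}
```

The exponent obtained is $3 = 2p - 1$, in agreement with the statement of the lemma for $p = 2$.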

Let $t$ be any element of $\{\sigma\}$ not in $\sigma_1$. There is a least positive integer
$n > 1$ such that $\sigma_n$ contains $t$. As an element of $\sigma_n$, $t$ is equal to a linear homogeneous
expression in a finite number of elements of $\sigma'_{n-1}$ and a finite number
of derivatives of elements of $\sigma'_{n-1}$ with elements of the ring or integers for
coefficients.† But, by the lemma, each of these elements has a power in $\sigma_{n-1}$.
Let $r$ be their number and $s$ the maximum of the powers to which each must
be raised to give an element of $\sigma_{n-1}$. Then $t^{rs-r+1}$ is in $\sigma_{n-1}$, for each term of the
same power of the linear expression contains an $s$th power of some one of the
elements and hence each term is in $\sigma_{n-1}$.
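The exponent in this step rests on a counting fact: if $r$ non-negative exponents add up to $rs - r + 1$, at least one of them is $s$ or more. The following is a minimal brute-force sketch of that fact, not part of the original argument; the function name `every_term_has_high_power` and the sample values $r = 3$, $s = 4$ are assumptions chosen for illustration.

```python
from itertools import product


def every_term_has_high_power(r, s, total):
    """True if every exponent tuple (e_1, ..., e_r) of non-negative integers
    with e_1 + ... + e_r == total has at least one e_i >= s."""
    for exps in product(range(total + 1), repeat=r):
        if sum(exps) == total and max(exps) < s:
            return False
    return True


# Hypothetical sample values: r elements, each needing at most the s-th power.
r, s = 3, 4
ok_at_bound = every_term_has_high_power(r, s, r * s - r + 1)
ok_below = every_term_has_high_power(r, s, r * s - r)
```

With total exponent $rs - r + 1 = 10$ every monomial contains a fourth power of some element, whereas with total exponent $rs - r = 9$ the monomial with exponents $(3, 3, 3)$ does not, so the bound cannot be lowered by this counting argument alone.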

This power of $t$ by the same reasoning has a power in $\sigma_{n-2}$. Hence $t$ has
a power in $\sigma_{n-2}$. Continuing this process a finite number of steps gives the


THEOREM 1. If $t$ is any element of a perfect differential ideal $\{\sigma\}$ of a differential
ring $R$ of characteristic zero determined by a set $\sigma$, then some positive
integral power of $t$ is in the differential ideal $[\sigma]$ determined by $\sigma$.‡


2.§ Lemma. If a perfect differential ideal $\pi$ contains the product $ab$ of any
two elements $a$ and $b$ then it contains the product $a_mb_n$ of any derivatives of $a$
and $b$.||


* If $R$ were of characteristic $p$ we could not draw this conclusion.

† If $n$ is an integer and $a$ an element of the ring, $1a = a$, $-1a = -a$, $na = (n-1)a + a$, $na = an$.

‡ The theorem is not true for rings of non-zero characteristic.

§ The results of this and the next article are independent of Theorem 1 and true for non-zero
characteristic.

|| $m$ or $n$ may be zero; $a_0 = a$ and we shall speak of the zero derivative of $a$. $a_p = (a_{p-1})_1$.




Assume that $a_mb_n$ is in $\pi$. Then $\pi$ contains

$$a_{m+1}b_n\,(a_mb_n)_1 - a_{m+1}b_{n+1}\,(a_mb_n) = (a_{m+1}b_n)^2.$$

Hence, by the definition of a perfect differential ideal, $\pi$ contains $a_{m+1}b_n$.
Similarly, $\pi$ contains $a_mb_{n+1}$. Since, by hypothesis, $\pi$ contains $a_0b_0$, the lemma
is obtained by induction.
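As a hedged illustration of the first induction step (a worked case written out here, not taken from the original text), let $\pi$ denote the perfect differential ideal, and suppose it contains $ab$. With subscripts denoting derivatives:

```latex
\begin{align*}
(ab)_1 &= a_1 b + a\,b_1 \in \pi,\\
a_1 b\,(ab)_1 - a_1 b_1\,(ab)
  &= (a_1 b)^2 + a_1 b\,a\,b_1 - a_1 b_1\,a\,b
   = (a_1 b)^2 \in \pi .
\end{align*}
```

Since $\pi$ is perfect and contains the square of $a_1 b$, it contains $a_1 b$ itself. Interchanging the roles of $a$ and $b$ gives $a\,b_1$ in the same way.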

THEOREM 2. The intersection $\{\sigma, a\} \cap \{\sigma, b\}$ of the perfect differential ideals
determined by the sets obtained by adjoining elements $a$ and $b$, respectively, to
the set $\sigma$ of elements is the perfect differential ideal $\{\sigma, ab\}$ determined by the set
obtained by adjoining the product $ab$ to $\sigma$.*


Every element of $\{\sigma, ab\}$ is in the intersection. We have only to show that
any element $t$ of the intersection is in $\{\sigma, ab\}$.

By Theorem 1 some power of $t$, say $t^p$, is in $[\sigma, a]$, and some power, say
$t^q$, is in $[\sigma, b]$. Hence $t^{p+q}$ is in $\{\sigma, ab\}$ since each term of the product of the
linear expression for $t^p$, in terms of the elements of $\sigma$ and $a$ and their derivatives,
and for $t^q$, in terms of the elements of $\sigma$ and $b$, contains either elements
of $\sigma$ or a product $a_mb_n$ of derivatives of $a$ and $b$. By the definition of perfect
differential ideal, $\{\sigma, ab\}$ contains $t$.


DECOMPOSITION OF PERFECT DIFFERENTIAL IDEALS 


3. We shall say that a perfect differential ideal $\pi$ which is determined
by a set $\sigma$ has $\sigma$ as a basis. If every perfect differential ideal of a differential
ring $R$ has a finite basis, we say that $R$ is a differential ring with a basis theorem.


THEOREM 3.† Let

$$\pi_1 \subseteq \pi_2 \subseteq \cdots \subseteq \pi_m \subseteq \cdots$$

be an infinite sequence of perfect differential ideals of a differential ring with
a basis theorem such that each ideal contains its predecessor in the sequence.
There exists an integer $n$ such that

$$\pi_n = \pi_{n+1} = \cdots.$$


Let $\pi$ be the totality of elements in the ideals of the sequence. Let $a$ be
any element of $\pi$. Then $a$ is contained in some ideal of the sequence with a
sufficiently high subscript. Therefore $\pi$ contains $a_1$ or any element $b$ having
a power equal to $a$ and hence is a perfect differential ideal. $\pi$ has a finite basis

* A more general theorem could be proved but this is sufficient to our purpose. 
† Cf. van der Waerden, loc. cit., vol. II, p. 25.




which must be contained in some ideal $\pi_n$ of the sequence with a sufficiently
large subscript. But $\pi_n \subseteq \pi$; hence $\pi_n = \pi_{n+1} = \cdots$.

A perfect differential ideal $\pi$ will be called reducible if there exist perfect
differential ideals $\alpha$ and $\beta$ such that $\pi$ is a proper subset of $\alpha$ and of $\beta$ and
is their intersection $\alpha \cap \beta$. If a perfect differential ideal is not reducible, it is
said to be irreducible.*


THEOREM 4. A perfect differential ideal which is irreducible is prime.†


We show that a perfect differential ideal which is not prime is reducible.
Let $\pi$ be a perfect differential ideal which is not prime. There exist two elements
a and b such that $\pi$ contains ab but neither a nor b. Form the perfect
differential ideals $\{\pi, a\}$ and $\{\pi, b\}$. Each contains $\pi$ as a proper subset.
Their intersection $\{\pi, a\} \cap \{\pi, b\}$ by §2 is $\{\pi, ab\}$, but since ab is in $\pi$, the
intersection is $\pi$. Hence $\pi$ is reducible.


THEOREM 5. In a differential ring with a basis theorem, any perfect differ- 
ential ideal is the intersection of a finite set of irreducible or prime perfect differ- 
ential ideals. 


We suppose that the theorem is not true. Then there exists a perfect
differential ideal $\pi$ which is not the intersection of a finite number of irreducible
perfect differential ideals. $\pi$ must be reducible. Hence $\pi$ is the intersection
of two perfect differential ideals $\alpha$ and $\beta$ each containing $\pi$ as a proper
subset. At least one of the perfect differential ideals $\alpha$ and $\beta$ is not the intersection
of a finite number of irreducible perfect differential ideals. Let $\pi_1$ denote
this perfect differential ideal. By the same reasoning $\pi_1$ is a proper subset
of a perfect differential ideal $\pi_2$ which is not the intersection of a finite
number of irreducible perfect differential ideals. Continuing in this manner
we obtain an infinite sequence of perfect differential ideals, each containing
its predecessor as a proper subset. This contradiction of Theorem 3 proves
the theorem.

In such a finite set of irreducible or prime perfect differential ideals, we 
may delete in turn all ideals which contain other ideals of the set. The re- 
maining set will be called an essential set. 


THEOREM 6. If a perfect differential ideal $\pi$ is the intersection of each of
two essential sets $\alpha_1, \dots, \alpha_r$ and $\beta_1, \dots, \beta_s$, then r = s and the $\alpha$'s coincide
with the $\beta$'s after a suitable rearrangement.


* This use of the word "irreducible" is analogous to its use in algebra. Cf. van der Waerden,
loc. cit., vol. II, p. 36.

† An ideal is prime if it contains together with the product of any two elements at least one of
the elements.




$\alpha_1$ is contained in some $\beta_i$. If not, each $\beta_i$ would contain an element $b_i$
not in $\alpha_1$. By a repeated application of Theorem 2, $\pi$ contains the product
$b_1 \cdots b_s$. Hence $\alpha_1$ contains this product, which contradicts the fact that $\alpha_1$
is prime.

We may suppose that $\alpha_1$ is contained in $\beta_1$ after a suitable rearrangement
of the $\beta$'s. $\beta_1$ is contained in some $\alpha$ which must be $\alpha_1$. For if $\beta_1$ were contained
in $\alpha_k$, $k \ne 1$, then $\alpha_1$ would be contained in $\alpha_k$, contradicting the assumption
that the $\alpha$'s form an essential set. Hence $\alpha_1$ is $\beta_1$.

$\alpha_2$ is contained in some $\beta$ which cannot be $\beta_1$. Suppose that it is $\beta_2$. Then
$\beta_2$ is in $\alpha_2$ and is $\alpha_2$. Continuing in this manner the theorem is proved.


THE BASIS THEOREM 
4. In what follows we will need the following 


LEMMA. If the perfect differential ideal $\{\sigma\}$ has a finite basis, it has a finite
basis consisting of elements of $\sigma$.


Let $s_1, \dots, s_n$ be the elements of a finite basis of $\{\sigma\}$. Each $s_i$ as an
element of $\{\sigma\}$ is an element of $\{\sigma_i\}$ where $\sigma_i$ is a suitably chosen finite subset
of $\sigma$. Every $s_i$ is in $\{\sigma_1, \dots, \sigma_n\}$, hence $\{\sigma\}$ is $\{\sigma_1, \dots, \sigma_n\}$.

5. We prove the following theorem: 


THEOREM 7. The differential ring R of forms in a finite set of indeterminates
$y_1, \dots, y_n$ with coefficients in a differential field $\mathfrak{F}$ of characteristic zero is a
differential ring with a basis theorem.


We suppose that the theorem is not true and force a contradiction. 


LEMMA. Let $\Sigma$ be a perfect differential ideal without a finite basis. Let
$F_1, \dots, F_s$ be forms such that by multiplying each form of $\Sigma$ by some product
of non-negative powers of $F_1, \dots, F_s$ a system $\Lambda$ is obtained such that $\{\Lambda\}$ has
a finite basis. Then $\{\Sigma, F_1 \cdots F_s\}$ has no finite basis.


Suppose, as we may by the lemma of the preceding article, that
$\{\Sigma, F_1 \cdots F_s\}$ is $\{H_1, \dots, H_r, F_1 \cdots F_s\}$, where $H_1, \dots, H_r$ are forms
of $\Sigma$. Let a finite basis of $\{\Lambda\}$ be chosen from the forms of $\Lambda$ and let $K_1, \dots, K_p$
be forms of $\Sigma$ such that the forms which they yield, after the above described
multiplications, form this basis of $\{\Lambda\}$. Let $\Pi$ be the totality of H's and K's.
Then $\{\Pi, F_1 \cdots F_s\}$ is $\{\Sigma, F_1 \cdots F_s\}$ and $\{\Pi\}$ contains $\{\Lambda\}$.

Since $\Sigma$ has no finite basis, there exists a form L of $\Sigma$ not in $\{\Pi\}$. Some
$F_1^{g_1} \cdots F_s^{g_s} L$ is in $\{\Lambda\}$ and hence in $\{\Pi\}$. Consequently, if g is the maximum
of the $g_i$'s, $F_1^{g} \cdots F_s^{g} L^{g}$ and hence $F_1 \cdots F_s L$ are in $\{\Pi\}$. L is in
$\{\Pi, F_1 \cdots F_s\}$ by our assumption and obviously in $\{\Pi, L\}$; hence by §2
it is in $\{\Pi, F_1 \cdots F_s L\}$ which is $\{\Pi\}$. This contradiction proves the lemma.




LEMMA. Let $\Sigma$ and $\{\Sigma, F_1 \cdots F_s\}$ be perfect differential ideals having no
finite basis sets. Then at least one of the perfect differential ideals $\{\Sigma, F_1\}, \dots,$
$\{\Sigma, F_s\}$ has no finite basis.


We may limit ourselves to the case of s = 2. Let $\{\Sigma, F_1\}$ and $\{\Sigma, F_2\}$ be
$\{\Phi_1, F_1\}$ and $\{\Phi_2, F_2\}$ respectively, where $\Phi_1$ and $\Phi_2$ are finite sets taken,
according to the lemma of the preceding article, as subsets of $\Sigma$. $\{\Sigma, F_1\}$ and
$\{\Sigma, F_2\}$ are also $\{\Phi_1, \Phi_2, F_1\}$ and $\{\Phi_1, \Phi_2, F_2\}$, respectively. $\{\Sigma, F_1 F_2\}$ is the
intersection $\{\Sigma, F_1\} \cap \{\Sigma, F_2\}$ by §2 and hence also $\{\Phi_1, \Phi_2, F_1 F_2\}$, which
contradiction proves the lemma.

We consider the totality of perfect differential ideals of R without finite
basis sets.* We form a basic set† for each. By a lemma of Ritt's‡, we know
that there is a perfect differential ideal $\Sigma$ without a finite basis whose basic
sets are not of higher rank‡ than the basic sets of any other perfect differential
ideal without a finite basis. Let


(1)  $A_1, A_2, \dots, A_p$


be a basic set of $\Sigma$. Then $A_1$ is not an element of $\mathfrak{F}$, as otherwise $\Sigma$ would have
unity as a finite basis.

For every form of $\Sigma$ not in (1), let a remainder§ with respect to (1) be
found. Let $\Lambda$ be the system composed of the forms of (1) and the products of
the forms of $\Sigma$ not in (1) by the products $S_1^{s_1} \cdots S_p^{s_p} I_1^{i_1} \cdots I_p^{i_p}$ of the
separants|| $S_i$ and the initials|| $I_i$ of the forms of (1) used in their reduction.||
Let $\Omega$ be the system composed of (1) and the remainders of the forms of $\Sigma$
not in (1).

$\{\Omega\}$ has a finite basis. If not, $\Omega$ would have non-zero forms not in (1).
Such forms would be reduced‡ with respect to (1) and $\{\Omega\}$ would have lower
basic sets than $\Sigma$, contradicting our assumption. Consequently $\{\Lambda\}$ has a
finite basis, for $\{\Lambda\}$ is $\{\Omega\}$.

The lemmas show that some $\{\Sigma, S_i\}$ or some $\{\Sigma, I_i\}$ has no finite basis.
But for every i, $S_i$ and $I_i$ are distinct from zero, and reduced with respect to
(1). Hence the basic sets of $\{\Sigma, S_i\}$ and $\{\Sigma, I_i\}$ are lower than (1). This
contradiction proves the theorem.


* Cf. Ritt, loc. cit. In what follows we use the concepts and theorems of §§2 to 5 in this book.
The reader will have no difficulty seeing that these articles with slight changes in language apply to
our abstract forms.

† Ritt, loc. cit., p. 6.

‡ Ritt, loc. cit., p. 4.

§ Ritt, loc. cit., p. 9 and p. 7 for the existence. Notice that in case of non-zero characteristic a
separant may vanish.

|| Ritt, loc. cit., p. 7.




THEOREM 8. Any infinite system of forms contains a finite subset such that 
every form of the system has a power which is a linear combination of the forms 
of the subset and their derivatives with forms for coefficients. 


This follows at once from Theorems 1 and 7, and the lemma of the pre- 
ceding article. 


ANALOGUE OF THE HILBERT-NETTO THEOREM 
We prove the following 


THEOREM 9. Let $\Sigma$ and G be a system of forms and a form respectively of the
differential ring R of forms in a finite set of indeterminates $y_1, \dots, y_n$ and with
coefficients in a differential field $\mathfrak{F}$ of characteristic zero. If G is not in $\{\Sigma\}$ then
there exists a set of elements $\alpha_1, \dots, \alpha_n$ of an extension of $\mathfrak{F}$ such that every form
of $\Sigma$ vanishes when the $\alpha$'s are substituted for the indeterminates and such that
G does not vanish for the same substitution.


Let $\Pi_1, \dots, \Pi_s$ be prime perfect differential ideals whose intersection is
$\{\Sigma\}$. Some $\Pi$, say $\Pi'$, does not contain G. By a theorem of the author's
dissertation,* there exists a set of elements $\alpha_1, \dots, \alpha_n$ of an extension of $\mathfrak{F}$ such
that $\Pi'$ is the set of forms of R that vanish when the $\alpha$'s are substituted for
the indeterminates. The $\alpha$'s are then solutions of $\Sigma$ but not of G.

In what follows, we suppose that $\mathfrak{F}$ is a differential field of functions of a
complex variable x meromorphic on an open region $\mathfrak{A}$. Let $\Pi'$ have $A_1, \dots, A_p$
as its basic set. $A_1$ is of class greater than zero.† Let $S_i$ and $I_i$ be the separant
and initial, respectively, of $A_i$. We show that the basic set has analytic solutions,
when regarded as polynomials in the $y_{ij}$ that they contain‡, for which
neither G nor any separant or initial vanishes. We suppose that every analytic
solution of the basic set is a solution of $T = S_1 \cdots S_p I_1 \cdots I_p G$. Then by the
Hilbert-Netto theorem for polynomials, T is in $\Pi'$. This contradicts the fact
that $\Pi'$ is prime, for $\Pi'$ can contain no separant or initial of the forms of its
basic set, and was chosen so as not to contain G. For a suitable value of x
the values of the analytic functions in such a solution provide initial conditions
for a regular§ analytic solution of the basic set which is not a solution
of G. By a theorem of Ritt's||, such a solution is a solution of $\Pi'$ and hence
of $\Sigma$. This together with Theorem 1 gives Ritt's result (b).

* Raudenbush, loc. cit., p. 517, Theorem V.

† Ritt, loc. cit., p. 3.

‡ Certain of the $y_{ij}$ may be indeterminate.

§ Ritt, loc. cit., p. 20.

|| Ritt, loc. cit., p. 25. The theorem is true if the system is not closed, provided it is an ideal.


BARNARD COLLEGE, COLUMBIA UNIVERSITY, 
New York, N. Y. 




DIFFERENTIABLE FUNCTIONS DEFINED
IN CLOSED SETS. I†


BY
HASSLER WHITNEY‡


1. Introduction. In a recent paper§ the author has shown that if a function
f(x) defined in a closed set A in n-space E satisfies certain conditions
involving Taylor's formula (in finite form), i.e. if it is "of class $C^m$ in A,"
then its definition can be extended over E so that it will have continuous
partial derivatives through the mth order. In this paper we restrict ourselves
to the one-dimensional case. (For the above theorem in this case, see §4.)
Let $x_0, \dots, x_m$ be distinct points of A. If $P(x) = c_0 + \cdots + c_m x^m$ is the polynomial
of degree at most m such that $P(x_i) = f(x_i)$ ($i = 0, \dots, m$), the mth
difference quotient of f(x) at these points is $\Delta_{0\cdots m} f = \Delta^m f(x) = m!\,c_m$. The main
object of this paper is to prove (see §§2 and 3 for definitions)


THEOREM I. A necessary and sufficient condition that f(x) be of class $C^m$
in A is that $\Delta^m f(x)$ converge in A.
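
The mth difference quotient defined above can be computed directly. The following is a minimal sketch, assuming exact rational arithmetic and Newton's divided-difference table as the means of finding the leading coefficient $c_m$; the function name is hypothetical and not taken from the paper.

```python
from fractions import Fraction
from math import factorial

def difference_quotient(f, points):
    """Return m! * c_m, where c_m is the leading coefficient of the
    polynomial of degree <= m that agrees with f at the m+1 points."""
    xs = [Fraction(x) for x in points]
    if len(set(xs)) != len(xs):
        raise ValueError("points must be distinct")
    col = [Fraction(f(x)) for x in xs]
    m = len(xs) - 1
    # Newton's divided-difference table, updated in place column by column.
    for k in range(1, m + 1):
        for i in range(m, k - 1, -1):
            col[i] = (col[i] - col[i - 1]) / (xs[i] - xs[i - k])
    # col[m] is now the leading coefficient c_m.
    return factorial(m) * col[m]

cubic = difference_quotient(lambda x: x**3 - 2 * x, [0, 1, 3, 7])
slope = difference_quotient(lambda x: x * x, [1, 4])
```

For the cubic the third difference quotient is 3! times the leading coefficient 1, and for two points the quotient reduces to the ordinary slope.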


This theorem furnishes a direct definition of the differentiability of a function;
the former definition (see §3) involved the existence of other functions
$f_1(x), \dots, f_m(x)$.

The necessity of the condition is easily proved. The definition of f(x) being
extended over the x-axis E, consider any m+1 points $x_0, \dots, x_m$
($x_0 < \cdots < x_m$). Define P(x) as above. As $f(x_i) - P(x_i) = 0$ ($i = 0, \dots, m$)
there is a point $x'$ ($x_0 < x' < x_m$) such that $(d^m/dx^m)[f(x') - P(x')] = 0$. But
$d^m P(x)/dx^m = m!\,c_m = \Delta_{0\cdots m} f$; hence $\Delta_{0\cdots m} f = d^m f(x')/dx^m$. Therefore if
$x_0, \dots, x_m$ are in A and are sufficiently near a point $x^*$ of A, $\Delta_{0\cdots m} f$
$= d^m f(x')/dx^m = d^m f(x^*)/dx^m$ approximately, and $\Delta^m f(x)$ converges in A (in
fact, in E). This may be proved also from (2.6) for s = m.
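
The necessity argument can be observed numerically. The sketch below assumes the sample function sin x, the order m = 2 and the point $x^* = 0.5$, none of which come from the paper; it compares the second difference quotient at three nearby points with the second derivative at $x^*$.

```python
import math

def difference_quotient(f, xs):
    """m! times the divided difference of f at the m+1 distinct points xs."""
    col = [f(x) for x in xs]
    m = len(xs) - 1
    for k in range(1, m + 1):
        for i in range(m, k - 1, -1):
            col[i] = (col[i] - col[i - 1]) / (xs[i] - xs[i - k])
    return math.factorial(m) * col[m]

x_star = 0.5
second_derivative = -math.sin(x_star)  # d^2/dx^2 of sin x at x_star
errors = []
for h in (0.1, 0.01):
    pts = [x_star - h, x_star + 0.3 * h, x_star + h]
    errors.append(abs(difference_quotient(math.sin, pts) - second_derivative))
```

Since the quotient equals the second derivative at some intermediate point and the third derivative of sin x is at most 1 in absolute value, each error is bounded by the corresponding h, up to rounding.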

We note that, for f(x) =fo(x) to be of class C™ in a general closed set A, 
it is not sufficient that there exist functions f,(x) (s=1, - - - , m) in A such 
that df,(x)/dx =f.4:(x) there. As an example, set fo(0)=0 and fo(x) =1/2% 
i=1, 2,---), and set f,(x)=0 and f.(x) =0 in the same 
point set A. 

The majority of the paper is devoted to the proof of Theorem I. In the 

† Presented to the Society, October 28, 1933; received by the editors July 27, 1933.

‡ National Research Fellow.

§ Analytic extensions of differentiable functions defined in closed sets, these Transactions, vol. 36 
(1934), pp. 63-89; this paper will be referred to as A.E. 




last section we study Taylor’s formula in finite form, when it holds in closed 
sets, and when its validity implies differentiability of the given function. 
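2. Difference quotients.† If $x_0, \dots, x_m$ are distinct numbers, set‡


(2.1)  $u_{ij} = x_i - x_j, \qquad a^i_{0\cdots m} = \dfrac{1}{u_{i0} \cdots u_{i,i-1}\, u_{i,i+1} \cdots u_{im}} .$


Given a function f(x), we define the mth difference quotient by the formula


(2.2)  $\Delta^m f(x) = \Delta(x_0, x_1, \dots, x_m; f) = \Delta_{0\cdots m} f = m! \sum_{i=0}^{m} a^i_{0\cdots m} f(x_i) .$

In particular, $\Delta_0 f = f(x_0)$, $\Delta_{01} f = [f(x_1) - f(x_0)]/u_{10}$. $\Delta_{0\cdots m}$ is symmetric
in the points $x_0, \dots, x_m$.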
If «22, 
(ai... 
%o1 


hence 


$ s! 0 1 
— (Aiz.... — Aos...2) = =|- + 
Uo1 Uo1 


= s! (xi) = Aoie---s- 
i=0 
Suppose * is a set of subscripts containing neither 0, 1, nor 2; then for 
some m, 


m m 
Aoize = —— (Aize — = —— (Aize — Acie). 


Solving for Aow, we find 


Uo2 
(2.4) = — + — Anis, 


Uo1 


which may be written as follows: = 0. 
Let $x_0, \dots, x_s$ be distinct numbers. If we solve the equations $\sum_{i=0}^{s}$


† Compare Nörlund, Differenzenrechnung, Berlin, 1924, pp. 8-9. It is seen that $\Delta_{0\cdots m} = m!$

t In the equations below, the numbers 0, 1, - - - , when appearing as subscripts, are to be consid- 
ered as variables. Thus, as a particular case of (2.1), a$.3=1/(wsowso); in the second equation of §6, 
Ej1/uj=1/uorx+ + - + . Without this notation, the equations would often get quite cumbersome. 




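$(x_i - x)^j z_i = \delta_{js}$ ($j = 0, \dots, s$), x being any fixed number, we find
$z_i = a^i_{0\cdots s}$. Hence


(2.5)  $\sum_{i=0}^{s} a^i_{0\cdots s} (x_i - x)^j = 0 \quad (j = 0, \dots, s-1), \qquad \sum_{i=0}^{s} a^i_{0\cdots s} (x_i - x)^s = 1 .$


Suppose $f(x) = f_0(x), \dots, f_m(x)$, $R(x', x) = R_0(x', x)$ satisfy (3.1) below
for s = 0. Then (2.2) and (2.5) give


(2.6)  $\Delta_{0\cdots s} f = f_s(x) + s! \sum_{i=0}^{s} a^i_{0\cdots s} \Big[ \sum_{j=s+1}^{m} \frac{f_j(x)}{j!} (x_i - x)^j + R(x_i, x) \Big] \quad (s \le m).$


If $f(x) = c_0 + \cdots + c_m x^m$ is a polynomial of degree at most m, then (3.1) is
satisfied with $f_m(x) = m!\,c_m$ and $R_s(x', x) = 0$. Setting s = m in (2.6) gives


(2.7)  $\Delta_{0\cdots m} f = m!\,c_m .$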
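
The identity (2.5) lends itself to an exact check. The sketch below assumes that the coefficient attached to $x_i$ is the reciprocal of the product of the differences $x_i - x_k$ over all $k \ne i$, which is the form required for (2.2) to agree with the interpolation definition of the introduction; the sample points and the function names are hypothetical.

```python
from fractions import Fraction

def a_coeffs(xs):
    """Coefficients a^i = 1 / prod_{k != i} (x_i - x_k) for distinct xs."""
    out = []
    for i, xi in enumerate(xs):
        prod = Fraction(1)
        for k, xk in enumerate(xs):
            if k != i:
                prod *= (xi - xk)
        out.append(1 / prod)
    return out

def moment(xs, x, j):
    """sum_i a^i (x_i - x)^j, the left member of (2.5)."""
    return sum(a * (xi - x) ** j for a, xi in zip(a_coeffs(xs), xs))

xs = [Fraction(v) for v in (0, 1, 3, 7)]  # s = 3
x = Fraction(5, 2)                        # an arbitrary fixed number
moments = [moment(xs, x, j) for j in range(4)]
```

With s = 3 the sums vanish for j = 0, 1, 2 and equal 1 for j = 3, whatever fixed number x is chosen.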
We say $\Delta^m f(x)$ converges in the set A if for each point x of A and every
$\epsilon > 0$ there is a $\delta > 0$ such that if $x_0, \dots, x_m$, $x_{0'}, \dots, x_{m'}$ are any two sets of
distinct points of A, all within $\delta$ of x, then


$| \Delta_{0\cdots m} - \Delta_{0'\cdots m'} | < \epsilon .$


$\Delta^m f(x)$ of course converges at all isolated points of A. We say $\Delta^m f(x) \to f_m(x)$
in A if $|\Delta_{0\cdots m} - f_m(x)| < \epsilon$ whenever $x_0, \dots, x_m$ are in A and within $\delta$ of x.
Evidently if $\Delta^m f(x) \to f_m(x)$ in A, then $f_m(x)$ is continuous in the set of limit
points of A.

DIFFERENTIABLE FUNCTIONS 


3. Definition of differentiable functions. Let $f(x) = f_0(x)$ be defined in the
closed set A. We say f(x) is of class $C^m$ in A (see A. E.) if there exist functions
$f_1(x), \dots, f_m(x)$, $R(x', x) = R_0(x', x), \dots, R_m(x', x)$ in A such that

(3.1)  $f_s(x') = \sum_{i=s}^{m} \frac{f_i(x)}{(i-s)!} (x' - x)^{i-s} + R_s(x', x) \quad (s = 0, \dots, m),$

and for each s, each point x of A, and every $\epsilon > 0$ there is a $\delta > 0$ such that

(3.2)  $\left| \frac{R_s(x'', x')}{(x'' - x')^{m-s}} \right| < \epsilon \quad (x', x'' \text{ in } A;\ |x' - x|, |x'' - x| < \delta).$


If $f_i(x), \dots, f_m(x)$, $R_i(x', x)$ satisfy (3.1) and (3.2) for s = i, we say $f_i(x)$
can be expanded in a Taylor's formula to the (m−i)th order locally uniformly
in terms of $f_i(x), \dots, f_m(x)$. If f(x) is defined throughout an open interval
and has a continuous mth derivative there, then it is of class $C^m$, by Taylor's
theorem.

4. Extension of differentiable functions. If $f_0(x)$ is of class $C^m$ in terms of
$f_0(x), \dots, f_m(x)$ in A, then the definitions of these functions can be extended
throughout E so they will be continuous and so that $df_s(x)/dx = f_{s+1}(x)$ there
($s = 0, \dots, m-1$) (see A. E., Lemma 2). As the proof can be given more
simply in the one-dimensional case, we give it here. We can assume A is unbounded
on both sides; otherwise, take a point a beyond A on either side, and
set $f_s(x) = 0$ ($s = 0, \dots, m$) beyond a.

For each interval (a, b) of E−A, let P(x) be the polynomial of degree
at most 2m+1 such that


(4.1)  $\dfrac{d^s}{dx^s} P(a) = f_s(a), \qquad \dfrac{d^s}{dx^s} P(b) = f_s(b) \quad (s = 0, \dots, m);$

we set


(4.2)  $f_s(x) = \dfrac{d^s}{dx^s} P(x)$ in (a, b).

$df_s(x)/dx = f_{s+1}(x)$ ($s = 0, \dots, m-1$) in E−A; we must show that this holds
also at any point $x_0$ of A.
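
The polynomial of (4.1) is a two-point Hermite interpolant and can be found by solving a small linear system. The following sketch assumes exact rational data and an expansion in powers of x − a; the helper names are hypothetical and the elimination routine is a generic one, not a method described in the paper.

```python
from fractions import Fraction
from math import factorial

def falling(k, s):
    """k! / (k - s)! for 0 <= s <= k."""
    return factorial(k) // factorial(k - s)

def solve(matrix, rhs):
    """Solve a square linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]

def hermite_coeffs(a, b, fa, fb):
    """Coefficients c_k of P(x) = sum c_k (x - a)^k, degree <= 2m+1, whose
    s-th derivative equals fa[s] at a and fb[s] at b for s = 0..m."""
    m = len(fa) - 1
    h = Fraction(b) - Fraction(a)
    # The conditions at a fix the first m+1 coefficients directly.
    low = [Fraction(fa[s]) / factorial(s) for s in range(m + 1)]
    matrix, rhs = [], []
    for s in range(m + 1):
        matrix.append([falling(k, s) * h ** (k - s)
                       for k in range(m + 1, 2 * m + 2)])
        known = sum(falling(k, s) * low[k] * h ** (k - s)
                    for k in range(s, m + 1))
        rhs.append(Fraction(fb[s]) - known)
    return low + solve(matrix, rhs)

def derivative_at(coeffs, a, x, s):
    """Value of the s-th derivative of P at x."""
    t = Fraction(x) - Fraction(a)
    return sum(falling(k, s) * c * t ** (k - s)
               for k, c in enumerate(coeffs) if k >= s)

# m = 1 on (0, 1): value and slope 0 at a, value 1 and slope 0 at b.
cubic = hermite_coeffs(0, 1, [0, 0], [1, 0])
```

For this sample the result is the cubic $3x^2 - 2x^3$, and `derivative_at(cubic, 0, x, s)` then plays the role of $f_s(x)$ in (4.2).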
Suppose each $f_{s+1}(x)$ is continuous in E. Then given $x_0$ in A and $\epsilon > 0$,
take $\delta > 0$ so small that


$| f_{s+1}(x') - f_{s+1}(x_0) | < \dfrac{\epsilon}{2} \quad (|x' - x_0| < \delta).$


By (3.1) and (3.2), we can also take $\delta$ so small that if a is in A, $|a - x_0| < \delta$,
and


$f_s(a) = f_s(x_0) + f_{s+1}(x_0)(a - x_0) + R'(a, x_0),$


then $|R'(a, x_0)/(a - x_0)| < \epsilon/2$. Now take any point x within $\delta$ of $x_0$. If x is
in A, set a = x; otherwise, let a be the end point nearest $x_0$ of the interval of
E−A containing x. Now for some $x'$, $a < x' < x$,


$f_s(x) = f_s(a) + f_{s+1}(x')(x - a).$

Adding this to the last equation and dividing by $x - x_0$, we find


$\dfrac{f_s(x) - f_s(x_0)}{x - x_0} = f_{s+1}(x_0) + [f_{s+1}(x') - f_{s+1}(x_0)] \dfrac{x - a}{x - x_0} + \dfrac{R'(a, x_0)}{x - x_0} .$


As $|x' - x_0| < \delta$, $|x - a| \le |x - x_0|$ and $|x - x_0| \ge |a - x_0|$,


$\left| \dfrac{f_s(x) - f_s(x_0)}{x - x_0} - f_{s+1}(x_0) \right| < \epsilon \quad (|x - x_0| < \delta),$


as required. (We have given here the details of A. E., Lemma 1.)

We must prove still that each $f_s(x)$ is continuous at each point $x_0$ of A;
it is of course true in E−A. As $f_s(x)$ is continuous in A, it is sufficient to
prove that for every $\epsilon > 0$ there is a $\delta > 0$ such that if (a, b) is any interval
of E−A lying within $\delta$ of $x_0$, then


$| f_s(x) - f_s(a) | < \epsilon \quad (a \le x \le b).$


Take $\epsilon' < \epsilon/[2(m+1)^2 K]$, where K is a number to be determined later. Let
M be the maximum of $|f_i(x)|$ in A ($|x - x_0| \le 1$, $i = 0, \dots, m$). Take
$\delta < \epsilon/(2mM)$ and < 1 so small that (3.2) holds with $\epsilon$ replaced by $\epsilon'$ for any
$x'$ within $\delta$ of $x_0$. Now take any interval (a, b) of E−A lying within $\delta$ of $x_0$.
In (a, b), f(x) equals

) 2m+1 


t=0 t=—m+1 


where the 7; are determined by the relations 


m 
=) (b—a)*+ (6 — a)** = f,(0); 


dx* (i s)! i—m+1 (i 


hence 


2m+1 m 
(6 — = — = (6 — a)** = R,(, a). 


— 5)! ime (7 — 


Solving for the y;, we find 
R _Ri(b, a) a) 


= K; 


where the K;; depend on m alone. Set K =max | K ail ; then 


bend K R,(6, a) (m + 1)K 


In| 


|b — |b — 


Now if x is any point in (a, 5), then |x—a| <|b—a|, and 




d 


Qm+1 |x — 


<mM\x—a\+(m+1)Ké > 


i=m+1 lb 
< mMéi + (m + 1)?K |b — <.«, 
as required. 
THEOREM I, A PERFECT 

5. A succession of lemmas culminates in Lemma 7, which is the suffi- 
ciency part of Theorem I for perfect sets. 

Lemma 1. Let A be a closed set, and let $\Delta^s f(x)$ converge on A. Then we can
define $f_s(x)$ on the set of limit points $A^*$ of A so that the following is true. Given
x in $A^*$ and $\epsilon > 0$, we can choose a $\delta > 0$ so that if $x_0, \dots, x_s$ is any set of distinct
points of A lying within $\delta$ of x, then $|\Delta_{0\cdots s} - f_s(x)| < \epsilon$.

The proof is simple. 

Lemma 2. If $\Delta^s f_0(x)$ converges in the perfect set A, then for each point x of
A and every $\epsilon > 0$ there is a $\delta > 0$ such that


(5. 1) | Bog. | <e 
($0 \le t \le s$) whenever all the points concerned lie within $\delta$ of x.

This is trivial if t = 0. We assume it holds for numbers $0, \dots, t-1$, and
shall prove it for t. Given a point $x_s$, distinct from all former points, the
equations

Uos 


Uo's 


give 


1 


As $\Delta_{0\cdots s}$ converges, we can take M > 0 and $\delta' < \epsilon/(4M)$ so that $|\Delta_{0\cdots s}| < M$
whenever $x_0, \dots, x_s$ are within $\delta'$ of x. By induction, we can take $\delta < \delta'$ so
small that the first term on the right in (5.2) is in absolute value $< \epsilon/2$ whenever
all points concerned are within $\delta$ of x. Now given the points $x_0, \dots, x_{s-1}$,
$x_{0'}, \dots, x_{(t-1)'}$ within $\delta$ of x, let $x_s$ be another such point; then (5.2) gives
(5.1).

The lemma with t = s shows that $\Delta^{s-1} f(x)$ converges in A.

Lemma 3. If $\Delta^m f_0(x)$ converges in the perfect set A, then there are continuous
functions $f_1(x), \dots, f_m(x)$ in A such that $\Delta^s f_0(x) \to f_s(x)$ in A ($s = 1, \dots, m$).

We prove this successively for $s = m, m-1, \dots, 1$ with the help of Lemmas
1 and 2.

6. We proceed to the following lemma. 

Lemma 4. If Ag(x)->g,(x) and A'g(x)—>g:(x) in the perfect set A, then 
in A. 

Set g=p—1. If we apply the relation to a}... 
as a differentiable function of x;, we find 


‘ 1 
= — — + 


where e(x;)0 as Hence 
; 1 
s 
@.--¢ 25 
(65% 
AQ. 


where ---, as =0, - - - , g). Consider the 2g points 
Xo, * Xq. We have 


plese 


i—0 ii’ 


q ‘ q 1 @ 
= (p — DY >> —-+ 


j= 0 


i=0 ii’ 
As A'g(x)—g;(x), this gives, letting (j=0, - - - , q), 
1 
(6.1) > Ao... q! Qo... = Ao..-¢1- 


i= 


1 
+ a0) 

7=0 



Now given a point x of A and an e>0, take 5>0 so that if x, - - - 
Xj, +++ ,%q are within 6 of x, then 


gp(x) | <€. 


Then if xo, - - - , x, are within 6 of x, we find by adding points x», - - - 
within 6 of x and letting x; >; (¢=0, - - - , g) that 


|Ao..-081 — 
as required. 

Lemma 5. If A™fo(x) converges in the perfect set A, then there are continuous 
functions fi(x), - fm(x) such that in A. 

This follows from Lemmas 3 and 4. 

7. We now present the two final lemmas needed for the proof of Theorem I 
when A is perfect. 

Lemma 6. Let $g(x) = g_0(x), \dots, g_s(x)$ be defined in the perfect set A, and
suppose $\Delta^s g(x) \to g_s(x)$. If g(x) can be expanded in a Taylor's formula to the
(s−1)th order in terms of $g_0(x), \dots, g_{s-1}(x)$, then it can be expanded in a
Taylor's formula to the sth order locally uniformly in terms of $g_0(x), \dots, g_s(x)$.


Given a point x of A and an e>0, take 5>0 so that 


! 
(7.1) 


2 3 
whenever Xo, - - - , x, are within 6 of x (recall that g,(x) is continuous, by §2). 
Take any two points x and x, of A within 6 of x; we must show that 
| R®(2,, x0)| <e. 
Take 4’ so small that if | «’—x,| <6’, then 


Xo) Tos € 
where RY (x’, x0) = g(x") (x’ —a0)#/j!. Take M >| g,(xo)|. Take 


a point x,_, in A within 6’ of xo and so close to x that 


s! € 
< — 
Tos 3 


1 
and <—» 
2 


and (if s>2) take in succession points x,_2, - - - , %: in A so that 


(7.2) < foe 
let these points lie within 6 of x. Then if i<s, 




Toi Tos € 1 


1 € 
Tos * Timi 2s 3 2s 3 


1 s—1 s—1 = 

— Ao... = + a‘| + R‘ ” (xi, 
s! i=0 imo 

= atg(x,) — a? + (xi, *0), 


j=0 j! 
on account of (2.5). Therefore 


(x, ,%0) = ui, 
(7.3) A (x uo s—l ai 


s! imo 
and as rie/Tos (ros +7 


R“)(x,, Xo) Tos * * Te—1,8 
| — | + 


Os s! 


s—1 
+ x Tos Ts—-1,8 xo) | 


3] r T0,s—2 
s! 28 Tos T0s 


as required. 

Lemma 7. If $\Delta^m f(x)$ converges in the perfect set A, then $f_0(x) = f(x)$,
$f_1(x), \dots, f_m(x)$ can be defined in A so that f(x) is of class $C^m$ in A in terms
of the $f_s(x)$ ($s = 0, \dots, m$).

We define $f_1(x), \dots, f_m(x)$ by means of Lemma 5. Taylor's formula for
each $f_s(x)$ holds to the 0th order, as $f_s(x)$ is continuous (see §2). We prove in
succession that it holds to the kth order for $k = 1, \dots, m-s$. This completes
the proof of the lemma, and therefore of Theorem I for the case that A is
perfect.


P-SETS AND Q-SETS 
8. We shall prove a lemma which will be needed in the next part. Let


$A' = a_1, a_2, \dots$ be a set of isolated points, at least m+1 in number. With
each point $a_i$ we shall associate m other points $a_{i_1}, \dots, a_{i_m}$; these m points


Now 
| | Tos * Te—1,8 _ 
Tos 
— €, 
23 3 - 




together with $a_i$ we say form the Q-set $Q(a_i)$. Take a Q-set $Q_j$, and let
$a_{j_1}, \dots, a_{j_t}$ be all those points such that $Q(a_{j_k}) = Q_j$; these points form the
P-set $P_j$ corresponding to $Q_j$. Each point of $P_j$ is in $Q_j$. Each point $a_i$ lies in
just one P-set $P(a_i)$, as $a_i$ is associated with just one Q-set $Q(a_i)$; however, $a_i$
may lie in several Q-sets. Let $\delta(Q_j)$ be the greatest distance between pairs of
points of $Q_j$.

Lemma 8. The P-sets and Q-sets may be so chosen that for any two points
$a_i$ and $a_j$,


(8.1)  if $\dfrac{\delta(Q(a_i)) + \delta(Q(a_j))}{|a_i - a_j|} > 2m$, then $P(a_i) = P(a_j)$.

We first associate sets of points with certain of the limit points of the
points $a_1, a_2, \dots$ as follows. Let $c_j$ be a point such that there is a sequence of
points of A' approaching it from one side, say the left, while there is a nearest
point of A' to $c_j$ on the other side of $c_j$. Let $\mu$ equal m+1, or the number of
points $a_i$ between $c_j$ and the next limit point c to the right of $c_j$ if that number
is smaller, and let $a_{j_1}, \dots, a_{j_\mu}$ be the points nearest $c_j$ on the right (counting
from left to right). Let $\tau$ be the smallest of the numbers $|a_{j_s} - a_{j_t}|$
($s, t = 1, \dots, m$) which are $> |a_{j_1} - c_j|$, if there are such. Let $a_1(c_j)$ be a point
of A' to the left of $c_j$ such that

if $\tau$ is defined. Let $a_2(c_j), \dots, a_m(c_j)$ be points of A' lying between $c_j$ and
$a_1(c_j)$.

We now define the Q-sets. Given a point $a_i$, we associate another point
with it as follows. Suppose, Case I, there is a point $a_j$ whose distance from
$a_i$ is less than or equal to the distance from any other a to $a_i$; then we associate
$a_j$ with $a_i$, or that one of the pair $a_j$, $a_k$ which lies to the left of $a_i$ if
their distances from $a_i$ are the same. Suppose, Case II, there is no such point.
Then there is a limit point $c_j$ nearer $a_i$ than any point $a_k$. If there are two such
points, we consider that one $c_j$ on the left. The point we associate with $a_i$ is
then $a_1(c_j)$.

Suppose now we have associated a number of points with $a_i$, forming the
set of points S. We associate the next point in a fashion much the same as
above. If Case II has not occurred in associating the other points of S with
$a_i$, we again have two cases to consider. Case I, there is a nearest point $a_j$
to the set S; we then associate this point with S (or the point $a_k$ as above).
Case II, there is none; then take the point $c_j$ as above, and associate $a_1(c_j)$
with S. At any time we employ Case II, we immediately associate also the
points $a_2(c_j), a_3(c_j), \dots$ with S, till we have the required m+1 points $Q(a_i)$.

Note that the point we associate with S does not depend on which point
$a_i$ of S we started with. Also if Case I has occurred each time in forming the
subset S of $Q(a_i)$, then there is no point a not in S which lies between two
points of S.
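
For a finite set of points there are no limit points, so only Case I of the construction above can occur. The sketch below covers this simplified situation only; it assumes the points are given in increasing order and reads the tie rule as a preference for the neighbor on the left. Since no outside point lies between two points of S, each set S is a block of consecutive points, recorded here as a pair of indices. The function names are hypothetical.

```python
from collections import defaultdict

def q_set(points, i, m):
    """Index range (lo, hi) of the Q-set grown from points[i] (Case I only)."""
    if len(points) < m + 1:
        raise ValueError("need at least m + 1 points")
    lo = hi = i
    for _ in range(m):
        left_gap = points[lo] - points[lo - 1] if lo > 0 else None
        right_gap = points[hi + 1] - points[hi] if hi + 1 < len(points) else None
        # Add the nearer neighbor of the block; on a tie take the left one.
        if right_gap is None or (left_gap is not None and left_gap <= right_gap):
            lo -= 1
        else:
            hi += 1
    return (lo, hi)

def p_sets(points, m):
    """Group the indices of points that share the same Q-set."""
    groups = defaultdict(list)
    for i in range(len(points)):
        groups[q_set(points, i, m)].append(i)
    return dict(groups)

def satisfies_8_1(points, m):
    """Check condition (8.1) for every pair of points."""
    n = len(points)
    q = [q_set(points, i, m) for i in range(n)]
    diam = [points[hi] - points[lo] for lo, hi in q]
    for i in range(n):
        for j in range(i + 1, n):
            r = points[j] - points[i]
            if diam[i] + diam[j] > 2 * m * r and q[i] != q[j]:
                return False
    return True

points = sorted([0, 1, 5, 6, 20])
groups = p_sets(points, 2)
```

With m = 2 the points 0 and 1 share the Q-set {0, 1, 5}, the points 5 and 6 share {1, 5, 6}, and 20 has {5, 6, 20}; condition (8.1) holds for every pair.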

9. To prove that (8.1) holds take any two points $a_i$ and $a_j$; set $r_{ij} = |a_i - a_j|$.

(1) Suppose there are at most a finite number of points of A' between $a_i$
and $a_j$. If $\delta(Q(a_i)) + \delta(Q(a_j)) > 2m r_{ij}$, then either $\delta(Q(a_i)) > m r_{ij}$ or $\delta(Q(a_j))$
$> m r_{ij}$, say the former. Then there is a first time when, on adding a point a
to a set S in forming $Q(a_i)$, the distance from a to S is $> r_{ij}$.

(a) In forming S from $a_i$, Case I has occurred each time. For if Case II
had occurred, say in adding the point $a_1(c_k)$ to the subset $S_1$ of S, then a
would be some $a_s(c_k)$; but the distance from a to S is then at most the distance
from $a_s(c_k)$ to $a_1(c_k)$ which is less than the distance from $a_1(c_k)$ to $S_1$
which is by hypothesis $\le r_{ij}$.

(b) There is no point a, whose distance from S is <r;;. For suppose there 
were; then Case II must occur in adding a,=a;(c;) to S, and c; is nearer S 
than any point a,. (If Case I occurred, a, or a nearer point, not a, would 
be added to S.) Say c; lies to the left of S. Let a, and a, be the left and right- 
hand end points of S respectively. As there is a point a, distant <r,; from S, 
|a,—c,| <r,;. Suppose a; is not in S. As there are no limit points between a; 
and a;, a; lies to the right of S, and hence there is a first point a, to the right 
of S. Then as a; is in S, | a,—ag| <r;;. But as a, and a, are among the first 
m-+1 (or u) points to the right of c;, and |a,—c;| <|a,—a,|, (8.2) gives 
|a,—a| <|a,—a,| <r,;, a contradiction; therefore a; is in S. As a; is in S 
and |a,—c;| <ri;, (8.2) gives |ap>—ax| again a contradiction. 

(c) S contains $a_j$. For otherwise (b) would be contradicted.

(d) In forming $Q(a_j)$, the points of S are chosen first. For suppose not.
Then after perhaps adding some points of S to $a_j$, forming the set S', we
choose a point $a_k$ not in S. By (b), the distance from $a_k$ to S is $> r_{ij}$. As there
is a point in S whose distance from S' is at most $r_{ij}$, $a_k$ must have been chosen
under Case II; then the distance from some point $c_l$ to S' is $< r_{ij}$. But then
as $c_l$ is a limit point of points $a_p$, there is a point $a_p$ whose distance from S is
$< r_{ij}$, a contradiction.

Now in forming both $Q(a_i)$ and $Q(a_j)$, the points of S are chosen first.
As the remaining points chosen depend only on S, $Q(a_i)$ and $Q(a_j)$ must coincide;
hence $a_i$ and $a_j$ lie in the same P-set.

(2) Suppose there is a limit point of isolated points b between $a_i$ and $a_j$.
In forming $Q(a_i)$, the set S at any step is at a distance $< |b - a_i|$ from b; hence
in adding the next point $a_k$ to S, its distance from S is $< |b - a_i|$ if Case I
occurs, and is $< 2|b - a_i|$ if Case II occurs, by (8.2). Therefore $\delta(Q(a_i))$
$< 2m|b - a_i|$. Similarly $\delta(Q(a_j)) < 2m|b - a_j|$. Adding,


$\delta(Q(a_i)) + \delta(Q(a_j)) < 2m(|b - a_i| + |b - a_j|) = 2m r_{ij},$


completing the proof.

Remark. Given a point $a_i$, if there exist m points $a_{i_1}, \dots, a_{i_m}$ such that
the m intervals between $a_i, a_{i_1}, \dots, a_{i_m}$ are all $< \rho$, or if there exists a point
a not in $Q(a_i)$ within $\rho$ of $a_i$, then $\delta(Q(a_i)) < 2m\rho$. This follows from the proof
in (2).

THEOREM I, A CLOSED 


Each isolated point of A is enclosed in an interval; this gives a perfect
set B. The definition of f(x) is extended over B. With the help of Lemma 8
it is shown that $\Delta^m f(x)$ now converges over B. By Lemma 7, f(x) is of class
$C^m$ in B; hence the same is true in A.

10. The sets A' and B. Let $A_1$ be the set of isolated points of the closed
set A, let $A_2$ be the set of limit points of isolated points, and let $A_3$ be the
remaining points of A. Let A' consist of $A_1$, together with certain other points
as follows. $A_1 + A_2$ being closed, let I be any open interval of $E - (A_1 + A_2)$
containing points of $A_3$. If an end point $a_i$ of I is in $A_1$, then there is, in I, a
nearest point $a_1(a_i)$ of $A_3$ to $a_i$. We associate this point with $a_i$, and also
points $a_2(a_i), \dots, a_m(a_i)$ of $A_3$ in I, chosen so that


(10.1) |a,(as) a,(a;) | < | — a;| 


A' is a set of isolated points; we may name them $a_1, a_2, \dots$. A' is contained
in $A_1 + A_3$.

For each point $a_i$ of $A_1$, let $d(a_i)$ be its distance from the rest of A, and
let $B_i$ be a closed interval of length $d(a_i)/2$, with $a_i$ as center. Let the perfect
set B be A plus all of these intervals. Arrange the points of A' into P-sets
and Q-sets so as to obey Lemma 8. For each P-set $P_j$, let the corresponding
P'-set $P'_j$ contain the points of $P_j$, together with the points of any intervals
$B_i$ there may be which enclose points of $P_j$.

Given any set S of m+1 points in B, we shall define its complexity $\sigma(S)$
as follows. If all the points of S are in A, set $\sigma(S) = 0$. If S contains p > 0
points in B−A, and all these points lie in a single P'-set $P'_j$, let q be the
number of remaining points of S which do not lie in the corresponding Q-set
$Q_j$, and set $\sigma(S) = pq$. The complexity of S is in this case certainly $\le m^2$. If S
contains p points in B−A, and these points do not all lie in the same P'-set,
set $\sigma(S) = m^2 + p - 1$. The complexity of any set S is $\le m^2 + m$.
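
The definition of the complexity is a short case distinction, which the following sketch transcribes. The encoding of a set S is an assumption made here for illustration: each point of S in B−A is represented by the index of its P'-set, and each remaining point by the set of indices of the Q-sets that contain it. The function name is hypothetical.

```python
def complexity(gap_labels, rest_q_labels, m):
    """Complexity sigma(S) of a set S of m + 1 points of B.

    gap_labels: for each point of S in B - A, the index j of its P'-set.
    rest_q_labels: for each remaining point of S, the set of indices j
    of the Q-sets containing it (empty if it lies in no Q-set)."""
    if len(gap_labels) + len(rest_q_labels) != m + 1:
        raise ValueError("S must contain exactly m + 1 points")
    p = len(gap_labels)
    if p == 0:
        return 0                      # all points of S lie in A
    if len(set(gap_labels)) == 1:     # a single P'-set holds the p points
        j = gap_labels[0]
        q = sum(1 for labels in rest_q_labels if j not in labels)
        return p * q
    return m * m + p - 1              # the p points meet several P'-sets

examples = [
    complexity([], [set(), set(), set()], 2),
    complexity([1], [{1}, {2}], 2),
    complexity([1, 2], [{1}], 2),
    complexity([1, 2, 1], [], 2),
]
```

For m = 2 these samples give the values 0, 1, 5 and 6, the last being the bound $m^2 + m$.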

11. The following lemma together with Lemma 7 gives Theorem I. 


(s = 2,---,m). 




Lemma 9. Let f(x) be defined in the closed set A so that $\Delta^m f(x)$ converges in A.
Then A can be enclosed in a perfect set B, the definition of f(x) can be extended
over B, and $f_m(x)$ can be defined in B, so that $\Delta^m f(x) \to f_m(x)$ in B.

Define the sets A', B etc. as above. We may assume there are at least
m+1 points in A'. Define $f_m(x)$ at each point of $A_2 + A_3$ as in Lemma 1. Take
a fixed interval $B_i$ with center $a_i$; we define f(x) and $f_m(x)$ over $B_i$ as follows.
Let $Q_j = Q(a_i)$ be the corresponding Q-set. Let


(11.1)  $R_j(x) = \gamma_0 + \gamma_1 x + \cdots + \gamma_m x^m$


be the polynomial of degree at most m such that $R_j(x) = f(x)$ at each point
of $Q_j$; then $\Delta(Q_j) = m!\,\gamma_m$, by (2.7). Set


(11.2)  $f(x) = R_j(x), \qquad f_m(x) = m!\,\gamma_m$ in $B_i$.


The same polynomial $R_j(x)$ is used in defining f(x) and $f_m(x)$ over each interval
of the P'-set $P'_j$ corresponding to $Q_j$; hence if S is any set of m+1
points such that all of its points in B−A lie in $P'_j$, and all remaining points
lie in $Q_j$, then $f(x) = R_j(x)$ at each point of S, and therefore, by (2.7), $\Delta(S)$
$= m!\,\gamma_m = \Delta(Q_j)$.

Each point $x$ of $A_3$ is at a positive distance from $B - A$; by the definition
of $f_m(x)$, $\Delta^m f(x) \to f_m(x)$ at such points. Each point $x$ of $B - (A_2 + A_3)$ is in
an interval $B_j$; hence near $x$, $f(x)$ is a polynomial, and $\Delta^m f(x) \to f_m(x)$ there
also. It remains to show that for each point $x$ of $A_2$ and every $\epsilon > 0$ there is a
$\delta > 0$ such that if $S$ is any set of $m+1$ points of $B$ within $\delta$ of $x$, then

(11.3) $|\Delta(S) - f_m(x)| < \epsilon$.

By Lemma 1, we can take $\delta' > 0$ so that

(11.4) $|\Delta(S_0) - f_m(x)| < \dfrac{\epsilon}{(8m+8)^{m^2+m}}$

for any set $S_0$ of $m+1$ points of $A$ lying within $\delta'$ of $x$. Set $\delta = \delta'/(4m+2)$.
We shall prove the following:

(A) If $S$ is any set in $B$, of complexity $\sigma(S) = \sigma$, composed of sets of points
$S_1$ in $B - A$ and $S_2$ in $A$, and if $S_1$ lies within $\delta$ of $x$ and $S_2$ lies within $\delta'$ of $x$,
then

(11.5) $|\Delta(S) - f_m(x)| < \dfrac{\epsilon}{(8m+8)^{m^2+m-\sigma}}$.

As $\sigma \le m^2 + m$, (11.3) follows.
12. We note first that if $b_j$ is in some interval of the P'-set $P_i'$, and $b_j$ lies
within $\delta$ of $x$, then $Q_i$ lies within $\delta'$ of $x$. Say $a_j$ is the center of $B_j$; then $a_j$ lies


382 HASSLER WHITNEY [April 


within $2\delta$ of $x$, a limit point of points of $A'$. Hence $\delta(Q(a_j)) < 4m\delta$, by the
remark at the end of §9, and $Q_i = Q(a_j)$ lies within $(4m+2)\delta = \delta'$ of $x$.

We shall prove (A) first for $\sigma = 0$, then for $\sigma > 0$, using induction. Suppose
$\sigma = 0$. If $S$ is in $A$, the fact follows from (11.4). If $S$ contains points of $B - A$,
then all these points lie in a single P'-set $P_i'$, and the rest of $S$ lies in the
corresponding Q-set $Q_i$; hence $\Delta(S) = \Delta(Q_i)$. $Q_i$ lies within $\delta'$ of $x$; hence (11.4)
holds with $S_0$ replaced by $Q_i$ or by $S$, and therefore (11.5) holds.

Now suppose (11.5) is proved for all sets $S'$ with $\sigma(S') < \sigma$; we shall prove
it for any set $S$ with $\sigma(S) = \sigma$. Suppose first $\sigma > m^2$; then the points of $S$ in
$B - A$ lie in at least two P'-sets. Let $P_i'$ and $P_j'$ be two of these sets, let $b_i$
and $b_j$ be points of $S$ (in $B - A$) in $P_i'$ and $P_j'$ respectively, and let $a_i$ and $a_j$
be the centers of the corresponding intervals. Let $a_k$ be a point of $Q(a_i)$ not
lying in $S$. If $S' = S - b_i - b_j$, then, by (2.4),

(12.1) $\Delta(S) = \Delta(b_i, b_j, S') = \dfrac{b_i - a_k}{b_i - b_j}\,\Delta(b_i, a_k, S') + \dfrac{a_k - b_j}{b_i - b_j}\,\Delta(a_k, b_j, S')$.

The sets $S' + b_i + a_k$ and $S' + b_j + a_k$ each contain fewer points of $B - A$ than
$S$; hence their complexities are each $< \sigma$. Also $Q(a_i)$ and therefore $a_k$ lie
within $\delta'$ of $x$. Therefore, by induction,

(12.2) $|\Delta(b_i, a_k, S') - f_m(x)| < \dfrac{\epsilon}{(8m+8)^{m^2+m-\sigma+1}}$, $|\Delta(a_k, b_j, S') - f_m(x)| < \dfrac{\epsilon}{(8m+8)^{m^2+m-\sigma+1}}$.

As a; and a; lie in distinct P-sets, 5(Q(a;)) +6(Q(a;)) S2mr;;, by (8.1). As 

|b;—a,| <r;;/4 and | b;—a,| <r;;/4, |b; 2r;;/2. As a and a; lie in Q(a,) 

and Q(a,) respectively, | a;—ax| <5(Q(a,)) +6(Q(a;)) S (2m+1)r;;; hence 

|b; —ax| <(2m+2)r;;. Also | a,—b;| <(2m+2)r;;; hence 

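(12.3) $\left|\dfrac{b_i - a_k}{b_i - b_j}\right| < 4m + 4$, $\left|\dfrac{a_k - b_j}{b_i - b_j}\right| < 4m + 4$.

This with (12.2) and (12.1) gives

$|\Delta(S) - f_m(x)| \le \left|\dfrac{b_i - a_k}{b_i - b_j}\right| |\Delta(b_i, a_k, S') - f_m(x)| + \left|\dfrac{a_k - b_j}{b_i - b_j}\right| |\Delta(a_k, b_j, S') - f_m(x)|$

$< (8m+8)\,\dfrac{\epsilon}{(8m+8)^{m^2+m-\sigma+1}} = \dfrac{\epsilon}{(8m+8)^{m^2+m-\sigma}}$,

as required.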

Suppose now $0 < \sigma \le m^2$; then the points of $S$ in $B - A$ lie in a single P'-set
$P_i'$, and there are points of $S$ not in $P_i' + Q_i$. Let $b_j$ be a point of $S$ in $B - A$,
let $a$ be a point of $S$ not in $P_i' + Q_i$, and let $a_k$ be a point of $Q_i$ which is not in
$S$. If $S' = S - b_j - a$, the sets $S' + b_j + a_k$ and $S' + a + a_k$ each have a smaller




complexity than $S$. $a$ lies within $\delta'$ of $x$, and hence, by induction, (12.2) holds
with $b_i$ replaced by $a$. Let $a_j$ be the center of the interval $B_j$ containing $b_j$.

Suppose, (1), a=a; isin A’. Then |a;—);| >r;;/2. As a, is in Q;=Q(a,) 
while a; is not, |a,—b;| <|a,—a;| +7:;<(2m+1)r;;, by the remark, and 
| a;—a,| <(2m-+1)r;;. Hence (12.3) holds with }; replaced by a;=a, and 
(11.5) follows just as before. Suppose, (2), a isin A —(A’+Az2). From a, move 
toward a; to the first point a’ in A,+ Az. If a’ is in Ai, move back to the first 
point a:(a’) in A;. Then | a:(a’) —a’| < |a—a,| and | a,(a’) —a,(a’)| <|a;(a’) 
—a’| <|a—a,| (s=2,---, m), by (10.1). Hence 5(Q,;) <2m|a—a,|, by the 
remark, and |a,—b;| <(2m+1)|a—a,|, and |a—a,| < (2m+1)|a—a,|. As 
|a—b,| >|a—a,| /2, (12.3) and (11.5) follow, as before. If a’ is in As, there 
are m points of A’ nearer a; than a, and again 5(Q,;) <2m|a—a,| and (11.5) 
follows. Suppose finally, (3), a isin As. Again we must have 5(Q;) <2m|a—a,| 
and (11.5) follows. This completes the proof of (A), therefore of Lemma 9, 
and therefore of Theorem I. 


TAYLOR’S FORMULA 


13. Conditions under which Taylor's formula is valid. Taylor's formula
for $f(x)$ may hold to the $m$th order in certain closed sets even if $f(x)$ is not of
class $C^m$ (see §14). We find here a difference quotient condition equivalent to
the validity of Taylor's formula, at least for perfect sets.


LEMMA 10. If $f(x) = f_0(x)$ can be expanded in a Taylor's formula to the $m$th
order locally uniformly in terms of $f_0(x), \dots, f_m(x)$ in the closed set $A$, then
these functions are continuous in $A$.


It is apparent from (3.1) and (3.2) with $s = 0$ that $f_0(x)$ is continuous. Take
any $s$, $0 < s \le m$. We shall assume $f_j(x)$ is continuous for $s < j \le m$, if there are
such values of $j$, and shall prove that $f_s(x)$ is continuous.

Let $x_0, \dots, x_s$ be distinct points of $A$. If we subtract (2.6) with $x$ replaced
by $x_0$ from the same equation with $x$ replaced by $x_1$, we find


j=s+1 


— re 
(13 1) j! imo 


+s! > — R(x, 
i=0 


Given any limit point x» of A and any e>0, take 6<e/[2*+*(s+1)mM | (if 
s<m) and <1/2 so small that (3.2) holds with x and ¢ replaced by xo and 
e/(2*+2(s+1)!] respectively, where M =max | f,(x’)| (|x’—x0| <1, s<j<m). 
If s>1, take a point x, of A within 6 of xo, and take points x,_1, - - - , x2 of 
A so that ro;<7o,i4:/3(=2, - - - , s—1). Now take any point x, within 6 




of $x_0$ so that $r_{01} < r_{02}/3$ if $s > 1$. From (13.1) we see that $|f_s(x_1) - f_s(x_0)| < \epsilon$,
as required (see the proof of Lemma 6).

Let $x_0, \dots, x_s$ be an ordered set of points. We say they form an $(x_0, \rho)$-set
($\rho > 1$), if

(13.2) $r_{0,i-1} < r_{0i}/\rho$ $(i = 2, \dots, s)$.


THEOREM II. Let $f(x) = f_0(x), \dots, f_m(x)$ be defined in the closed set $A$. A
necessary condition that a Taylor's expansion for $f(x)$ should hold to the $m$th
order locally uniformly in terms of $f_0(x), \dots, f_m(x)$ is that for each (or some)
$\rho > 1$, each $s$ ($0 \le s \le m$), each point $x$ of $A$, and each $\epsilon > 0$, there exist a $\delta > 0$,
such that if $x_0, \dots, x_s$ is any $(x_0, \rho)$-set of points lying within $\delta$ of $x$, then

$|\Delta_{0 \cdots s} f - f_s(x)| < \epsilon$.


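By the last lemma, the $f_j(x)$ are continuous. Take $M$ so that $|f_j(x')| < M$
for $|x' - x| < 1$. Take $\delta < \epsilon(\rho-1)^s/[2(s+1)mM\rho^s]$ and $< 1$ so that
$|f_s(x') - f_s(x)| < \epsilon/2$ ($|x' - x| < \delta$), and so that (3.2) holds with $\epsilon$ replaced by
$\epsilon(\rho-1)^s/[2(s+1)!\,\rho^s]$. Now take any $(x_0, \rho)$-set of points $x_0, \dots, x_s$ lying
within $\delta$ of $x$. Then

$\dfrac{r_{0i}}{r_{ki}} < \dfrac{\rho}{\rho - 1}$

for $k \ne i$. For if $k < i$, then $r_{0k} < r_{0i}/\rho$, hence $r_{ki} \ge r_{0i} - r_{0k} > r_{0i}(1 - 1/\rho)$,
and $r_{0i}/r_{ki} < \rho/(\rho-1)$; if $k > i$, then $r_{0k} \ge r_{0,i+1} > \rho r_{0i}$, hence
$r_{ki} \ge r_{0k} - r_{0i} > (\rho - 1) r_{0i}$ and $r_{0i}/r_{ki} < 1/(\rho-1) < \rho/(\rho-1)$. Replacing $x$ by $x_0$ in
(2.6) gives immediately $|\Delta_{0 \cdots s} f - f_s(x_0)| < \epsilon/2$; hence $|\Delta_{0 \cdots s} f - f_s(x)| < \epsilon$.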
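The ratio bound used in this proof is easy to test numerically. The sketch below is only an illustration: it assumes $r_{ij} = |x_i - x_j|$, takes $x_0 = 0$, and builds random $(x_0, \rho)$-sets whose distances from $x_0$ grow by a factor larger than $\rho$ at each step; the helper names `make_rho_set` and `max_ratio` are hypothetical.

```python
import random


def make_rho_set(s, rho, rng):
    """Return points x_0, ..., x_s with x_0 = 0 and r_{0,i-1} < r_{0i}/rho for i = 2..s."""
    xs = [0.0]
    r = 1.0
    for i in range(1, s + 1):
        if i > 1:
            # growing by more than a factor rho forces r_{0,i-1} < r_{0i}/rho
            r *= rho * rng.uniform(1.01, 2.0)
        # each point may lie on either side of x_0
        xs.append(rng.choice((-1.0, 1.0)) * r)
    return xs


def max_ratio(xs):
    """Largest value of r_{0i}/r_{ki} over i >= 1 and k != i."""
    s = len(xs) - 1
    return max(
        abs(xs[i] - xs[0]) / abs(xs[i] - xs[k])
        for i in range(1, s + 1)
        for k in range(s + 1)
        if k != i
    )


if __name__ == "__main__":
    rng = random.Random(0)
    rho = 2.0
    xs = make_rho_set(5, rho, rng)
    print(max_ratio(xs), "<", rho / (rho - 1))
```

Such a check is no substitute for the argument above; it only confirms on samples that the ratio stays below $\rho/(\rho-1)$.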


THEOREM III. If $A$ is perfect, then the condition in Theorem II is also sufficient.


We shall prove successively for $s = 0, \dots, m$ that $f(x)$ can be expanded
in a Taylor's formula to the $s$th order locally uniformly in terms of $f_0(x), \dots,$
$f_s(x)$. Evidently $f_0(x)$ is continuous; hence this is true for $s = 0$. The proof
for a general $s$ follows the proof of Lemma 6; we need merely be careful to
choose $x_{s-1}, \dots, x_1$ so that $r_{0,i-1} < r_{0i}/\rho$ $(i = 2, \dots, s)$.

14. Taylor's formula and differentiability. We shall say the set $A$ has the
property $Z_\rho$ at the point $x$ ($\rho > 1$) if there is an $\eta > 0$ such that corresponding
to any two points $x_0$ and $x_1$ of $A$ within $\eta$ of $x$, points $x_2, \dots, x_s$ of $A$ can be
found such that




then 7;;/rk1 <p? for i~7, k~1. This condition is satisfied for instance by Can- 
tor’s set. sis any number <™m, m fixed. 


THEOREM IV.* Let $A$ be a closed set having the property $Z_\rho$ for some $\rho = \rho(x)$
at each point $x$, and let $f(x) = f_0(x), \dots, f_m(x)$ be defined in $A$. A necessary
and sufficient condition that $f(x)$ be of class $C^m$ in terms of $f_0(x), \dots, f_m(x)$ is
that Taylor's formula for $f(x)$ should hold to the $m$th order locally uniformly in
terms of $f_0(x), \dots, f_m(x)$.


In short, in this case, Taylor's formula for $f_0(x)$ implies Taylor's formula
for each $f_s(x)$.

The necessity of the condition being trivial, we turn to the sufficiency.
By Lemma 10, $f_m(x)$ is continuous. It remains to prove that for any $s$,
$0 < s < m$, $f_s(x)$ may be expanded in a Taylor's formula to the $(m-s)$th order
locally uniformly in terms of $f_s(x), \dots, f_m(x)$. We shall prove this for $s$,
assuming it for numbers $s+1, \dots, m$.

Let xo, - - - , x, be distinct points of A. Set 


(14.2) By = Dia wa, 


t=0 


(14.3) = = > (tos or)’ = > (- 1) 
i=0 i=0 l 
= 


where >.; means summation over all values of /. We can write (if s <<m) 


oj! jl bmg (k — j)! 
m xo) j-l C\’) 
= H —1 R, 
2, k! j/N1 + 


where 


™m 


1 
R= — Hj R(x, Xo). 


J+ 
Now if k=/>s, then on replacing 7 by k —j we find 
* For the special case that A is a closed interval, see a paper by the author, Derivatives,
difference quotients and Taylor's formula, Bulletin of the American Mathematical Society, vol. 40 (1934),
pp. 89-94; Theorem III.


i 


k 
J 


and if k>l=s, 


Therefore, as H;=0 (1<s) and H,=1, 


J! k! 


Putting this in (13.1) gives 


m ! ‘i 


k=s (k s)! j=s+1 j! t=0 


(14.5) 
+s! ai[R(x:, x0) — 
i=0 

Given a point x of A and an e>0, take p and 7» corresponding to x, and 
take 5’ <7 so that (3.2) holds with 6 and replaced by 5’ and [3m(s+1)!p?"] 
and with s taking on the values 0, s+1, - - - , m. Set 5=5’/(2p). Now if xo 
and x; are points of A within 6 of x, we can add points x, ---, x, of A so 
that (14.1) holds and these points will lie within 5’ of x. Then 


a uy Thi |Rj(%, xo) | € 


R;(%1, x = 
Xo) r r ri-s 3m(s + 1)! 


Tr r “ee 
01 i—1,¢ i+1,i si 01 


and similarly for the other remainder terms. Therefore | R,(x1, xo)| /ro, * <e, 
as required. 

COROLLARY. If $m \le 2$, Theorem IV holds for all closed sets.

The only value of $s$ we may need in the above proof is $s = 1$; the condition
$Z_\rho$ is satisfied trivially if $s = 1$.

Example. Theorem IV does not hold for all closed sets, as we now show, 
using m=3. Set a;=1/2', b;=1/27*, c;=1/2%4; bf =a;+0,;, cf 
d;=a;+b;—c,(i=1, 2,---). Let A be the set of points 0, a;, c/, d;, b/. Set 
fo(0) =fi(0) =f2(0) =fs(0) =0, 


* See Netto, Lehrbuch der Combinatorik, Leipzig, 1927, §158, (27). 


[April 
=(- 0 ) = 541," 
k—l 

; s/ Xs 


fo(a:) = 0, fo(ci) = 0, So(d;) = 0, Sol(bi) = b2c;, 
fila) =0, file?) =0, filds) = — diez, = + deci, 
t3(a;) 0, 0, f3(d;) = 0, ) = 0. 


As a, ef, d;) =0, while A(a;, cd d;, bf) =3! [b:(bs —c;)c; as 
io , A*f,(x) does not converge at x=0, and hence f(x) is not of class C%, 
by Theorem I. However, Taylor’s formula holds for fo(x) to the third order 
locally uniformly. For a calculation shows that R(x, y) =0 whenever x and y 
are chosen from the points a,, c/ ,d;,b/ ,except that R(b/ , a;) = , c/) 
, di) =R(c/ , bf ) —2c;); hence if x and y are chosen in any man- 
ner from the points a;, ci, di, b/ , R(v, x)/(y—x)*-0 as i+. Suppose now 
x; and y; are chosen from aj, c/, d;, b, and from a;, cj, d;, bj respectively, 
jX% (or x;=0 or y;=0). If & is the larger of the numbers i, 7, then 


|R( yi, | < 2b2 ce + (b2 + (ae + bx) + + 


and as | y;—2;|*2a,°/8, R(y;, 00 as i, (ji). Hence for 
some 6>0, if x and y are any two points of A within 6 of 0, | R(y, x)/(y—x)3| 
<e. This is true also at each isolated point of A; hence Taylor’s formula is 
valid. 

Note that we may increase $A$ to a perfect set by adding the intervals
between $a_i$ and $c_i'$ and between $d_i$ and $b_i'$, and giving the obvious definitions of
$f_0(x), \dots, f_3(x)$ there. In this example, Taylor's formula holds to the
required order for neither $f_1(x)$ nor $f_2(x)$.


HARVARD UNIVERSITY, 
CAMBRIDGE, Mass. 


NORMAL DIVISION ALGEBRAS OVER A 
MODULAR FIELD* 


BY 
A. ADRIAN ALBERT 


1. Introduction. Let $\phi(\omega) = 0$ have coefficients in a modular field $F$ of
characteristic $p$ and be irreducible in $F$. Then $\phi(\omega) = 0$ and the field $F(x)$
generated by any one of its roots $x$ are called separable or inseparable
according as $\phi(\omega) = 0$ has not or has multiple roots. It is well known† that if $\phi(\omega) = 0$
is inseparable, then

$\phi(\omega) = \sum_i \alpha_i \omega^{ip}$ ($\alpha_i$ in $F$),

and that there exist inseparable extensions $F(x)$ of $F$ if and only if some
quantity $\alpha$ of $F$ is not the $p$th power of any quantity of $F$.

An infinite field $F$ is called perfect if either $F$ is non-modular or every
quantity of $F$ has the form $\beta^p$ where $p$ is the characteristic of $F$ and $\beta$ is
in $F$. In any consideration of normal division algebras $D$ over $F$ the property
that $F$ is perfect is used only when we consider quantities of $D$ and the
minimum equations of these quantities. But if the degree of $D$ is not divisible
by the characteristic $p$ of $F$, then the assumption that $F$ is perfect
evidently has no value and is a needless, extremely strong restriction on $F$.

In most of the papers on the structure of normal division algebras written
recently in Germany‡, the assumption has been that $F$ is perfect. But I shall
prove here that if $F$ is perfect of characteristic $p$, then the degree $n$ of $D$ is not divisible by $p$.
Hence it is now necessary to consider algebras of degree $p^e$ over $F$ of
characteristic $p$, where $F$ is not perfect.

I shall give here a brief discussion of the validity of the major results on 
algebras over non-modular fields when F is assumed to be merely any infinite 
field. Moreover, I shall determine all normal division algebras of degree two 
over F of characteristic two, of degree three over F of characteristic three.§ 

2. The existence of a maximal separable sub-field of A. Let $A$ be any
normal division algebra of degree $n$ over any field $F$, and let


(1) $u_1, \dots, u_m$ $(m = n^2)$


* Presented to the Society, December 1, 1933; received by the editors November 22, 1933. 

† Cf. B. L. van der Waerden's Moderne Algebra for the theory of modular fields.

‡ In particular the papers by R. Brauer.

§ I have also completed a determination of all normal division algebras of degree four over F 
of characteristic two and have offered this more complicated determination for publication in the 
American Journal of Mathematics. 



NORMAL DIVISION ALGEBRAS 389 


be a basis of $A$ over $F$. Then it is known* that if $K$ is an algebraically closed extension
of $F$, the algebra $A_K$ over $K$ is a total matric algebra $M$. Let


(2) ap = 05 = Us = (14,7 =1,---,m), 
t=1 


where $\alpha, \beta = 1, \dots, n$ and $j = (\alpha - 1)n + \beta$. The quantities $\lambda_{ij}$, $\mu_{ij}$ are then in
$K$ and $e_{\alpha\beta}$ corresponds to an $n$-rowed matrix with unity in the $\alpha$th row and
$\beta$th column and zero elsewhere.

The rank equation of $A$ is the minimum equation of the quantity
$x = \sum_{i=1}^{m} \xi_i u_i$ where the $\xi_i$ are independent variables. Then it is known that we
have the result


THEOREM 1. The rank equation of A is the characteristic equation of the 
matrix 


(3) (a, = 1, n) 
where 
(4) Sas = j=(a—1)n+8. 


This equation has coefficients in $L = F(\xi_1, \dots, \xi_m)$ and is irreducible in $L$.
E. Noether and G. Köthe have given proofs‡ of


THEOREM 2. Algebra A of degree n over an infinite field F has separable sub- 
fields F(x) of degree n. 


Their proofs are not at all elementary while my very much earlier simpler 
proof§ for the case where F is non-modular holds and uses only Theorem 1. 
We may in fact prove 


THEOREM 3. The sub-fields $F(x)$ of Theorem 2 may be so chosen that $x$ satisfies

$\omega^n + \lambda_1 \omega^{n-1} + \cdots + \lambda_n = 0$ ($\lambda_1 \ne 0$, $\lambda_i$ in $F$).


For the rank equation R(w; &, ---, &m) is satisfied by any matrix (3) 
when the corresponding values of - - - , are given. Let ---, B, be 
m quantities of the infinite field F so chosen that (i, - - - , 8,1 are distinct 


* Cf. van der Waerden’s Algebra, II, p. 176. 

† For proof of Theorem 1, see L. E. Dickson's Algebren und ihre Zahlentheorie, pp. 259-262.
Dickson's proof uses only (2) and is an immediate consequence of his Theorem 5 without the
argument of the unnecessary section 132.

‡ Journal für Mathematik, vol. 166 (1932), pp. 182-184, for Köthe's proof, and Mathematische
Zeitschrift, vol. 37 (1933), pp. 514-541, p. 535 for Noether's proof.

§ Bulletin of the American Mathematical Society, vol. 36 (1930), pp. 649-650. 




390 A. A. ALBERT [April 


and 6,~8;, --- +Bn-1) for i=1, - - -, Then we solve (4) for 
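the $\xi_i$ and have proved the existence of $\xi_{i0}$ in $K$ for which $R(\omega; \xi_{10}, \dots, \xi_{m0}) = 0$
has distinct roots and the coefficient $\lambda_1(\xi_{10}, \dots, \xi_{m0})$ of $\omega^{n-1}$ is not zero.
Let $D(\xi_1, \dots, \xi_m)$ be the discriminant of $R(\omega; \xi_1, \dots, \xi_m)$. Then


$D(\xi_{10}, \dots, \xi_{m0}) \cdot \lambda_1(\xi_{10}, \dots, \xi_{m0}) \ne 0$,


so that $D(\xi_1, \dots, \xi_m) \cdot \lambda_1(\xi_1, \dots, \xi_m) \not\equiv 0$. But then there exist values
$\xi_{11}, \dots, \xi_{m1}$ of $\xi_1, \dots, \xi_m$ in $F$ such that $D(\xi_{11}, \dots, \xi_{m1}) \cdot \lambda_1(\xi_{11}, \dots, \xi_{m1}) \ne 0$ and hence
such that the rank equation of $A$ for $x = \sum \xi_{i1} u_i$ has distinct roots and
coefficient of $\omega^{n-1}$ not zero.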

The characteristic equation of the corresponding matrix (3) is an exact 
power of the minimum equation of x since x in the division algebra A has ir- 
reducible minimum equation. Since the characteristic equation has been 
shown to have distinct roots, it is the minimum equation of x and we have 
proved Theorems 2, 3. 

3. Known theorems. In this section we shall state certain well known the- 
orems on algebras over non-modular fields which hold for any infinite field. 
We first have 


THEOREM 4. Let $D$ be a normal division algebra of degree $n$ over $F$, and let
$Z$ be equivalent to any sub-field of $D$ of degree $n$. Then $D \times Z = D_Z$ is a total
matric algebra.


Wedderburn’s proof* of this theorem holds for an arbitrary field. As an 
immediate consequence of Theorem 2 we have 


THEOREM 5. There exist separable splitting fields of D of degree n. 


We of course say that $Z$ is a splitting field of $D$ if $D_Z$ is a total matric algebra.

We also have Wedderburn’s theorems: 

THEOREM] 6. Let A be a normal simple algebra of degree n* over F. Then 
$A = M \times D \sim D$, where $M$ is a total matric algebra and $D$ is a normal division
algebra whose degree is the index of A. Moreover D and M are uniquely deter- 
mined apart from an interior automorphism of A. 


THEOREM‡ 7. Let $B$ be a normal simple algebra over $F$ contained in any
algebra A over F with the same modulus as B. Then A=BXC where C also has 
the same modulus as A. 


* For Theorems 10, 12, see Wedderburn’s paper in these Transactions, vol. 22 (1921), pp. 129- 
135. The proof of Theorem 4 appears on p. 133 and the footnote to p. 134. 

† Cf. L. E. Dickson's Algebren, p. 120.

‡ Proceedings of the Edinburgh Mathematical Society, vol. 25 (1906-07), pp. 1-3.




The proofs given by Wedderburn of the above Theorems 6, 7 also hold in 
view of Theorem 5. They may also be applied, as in the non-modular case, to 
give my 

INDEX REDUCTION THEOREM.* Let $D$ be a normal division algebra of degree
(index) $n$ over any infinite field $F$, $Z$ an algebraic field of degree $r$ over $F$.
Then the index of $D_Z$ over $Z$ is
$n' = n/s$,
where the index reduction factor $s$ divides $r$.


As a consequence we have the whole Brauer exponent theory as well as
my
THEOREM† 8. Let $D$ be a normal division algebra of degree $n$ over any infinite
field $F$, $p$ a prime divisor of $n$. Then there exists a field $Z$ of degree $r$ over $F$ such
that
$D_Z = M \times B \sim B$ ($M$ total matric),
where $B$ is a cyclic division algebra of degree $p$ over its centrum $Z$.


THEOREM‡ 9. Let $Z_0$ be in $D$ so that the degree $r$ of the field $Z_0$ divides $n$ and
let $Z$ be equivalent to $Z_0$,
$D_Z = M \times B$,
as in the Index Reduction Theorem. Then the algebra $B_0$ over $Z_0$ of all quantities
of $D$ commutative with every quantity of $Z_0$ is equivalent to $B$ over $Z$.


We may indeed say that almost all of the recent general theory on normal
division algebras holds when $F$ is any infinite field. The determination
theorems on algebras of degree 2, 3, 4 do not hold however. We shall give here a
determination in the cases $n = 2, 3$, and, in a later American Journal paper,
the case $n = 4$. We shall require


THEOREM 10. Let $D$ be a normal division algebra of degree $n$ over $F$, and let
$x$ in $D$ have $\phi(\omega) = 0$ of degree $\nu$ as its minimum equation. Then


$\phi(\omega) = (\omega - x_\nu)(\omega - x_{\nu-1}) \cdots (\omega - x_1)$,
where the $\nu$ factors may be permuted cyclically.


THEOREM§ 11. Every root $y$ in $D$ of $\phi(\omega) = 0$ is a transform $t x t^{-1} = y$ of $x$ by
$t$ in $D$.


* On direct products, these Transactions, vol. 33 (1931), pp. 690-711. 

† For probably the best proof of Theorem 8 see (1), (2) on p. 725 of the joint paper by H. Hasse
and myself in these Transactions, vol. 34 (1932), pp. 722-726.

‡ On normal simple algebras, these Transactions, vol. 34 (1932), pp. 620-625.

§ Cf. Annals of Mathematics, vol. 30 (1929), pp. 322-338, Theorem 12. 




THEOREM 12. Let $f(\omega) = g(\omega) \cdot h(\omega)$ where $f$, $g$, $h$ have coefficients in $D$ and $\omega$ is
a scalar variable. Then if $\omega - x$ is a right divisor of $f(\omega)$, $h(\omega) = q(\omega)(\omega - x) + R$
where $R \ne 0$ is in $D$, then $\omega - R x R^{-1}$ is a right divisor of $g(\omega)$.


4. Algebras over perfect fields. We may now prove 


THEOREM 13. Let D be a normal division algebra of degree n over a perfect 
modular field F of characteristic p. Then n is not divisible by p. 


For by Theorem 8, if $n$ is divisible by $p$ then there exists an extension $Z$
of finite degree over $F$, such that $D \times Z = M \times B$ where $B$ is a cyclic division
algebra of degree $p$ over $Z$. But it is known* that then $Z$ is perfect. Moreover
$B = (X, S, \gamma)$ where $X$ is cyclic of degree $p$ over $Z$ and with generating
automorphism $S$, $\gamma$ in $Z$ is not the norm $N(f)$ of any $f$ in $X$. But $Z$ is perfect,
$\gamma = \delta^p = N(\delta)$, a contradiction.

5. Algebras of degree two. Let $D$ be a normal division algebra of degree
two over an infinite field $F$ of characteristic two. By Theorem 2, algebra $D$
contains a separable quadratic field $F(x)$, $x^2 = \lambda x + \mu$ where $\lambda \ne 0$, $\mu \ne 0$ are in
$F$. We let $i = \lambda^{-1} x$ so that $i^2 = \lambda^{-2}(\lambda x + \mu) = i + \alpha$ where $\alpha = \mu\lambda^{-2} \ne 0$ is in $F$.
The equation $\omega^2 = \omega + \alpha$ is cyclic and in fact has the roots $i$, $i + 1$. By Theorem
12 there exists a quantity $j$ in $D$ such that $ji = (i + 1)j$. But then $j^2 i = i j^2$.
Since $F(i)$ is a maximal sub-field of $D$, the quantity $j^2$ is in $F(i)$. But
$F(j^2) < F(i)$ since $j j^2 = j^2 j$, but $ji \ne ij$. Hence $j^2 = \gamma$ in $F$ and we have proved

THEOREM 14. Every normal division algebra $D$ of degree two over $F$ of
characteristic 2 is a cyclic algebra


$(1, i, j, ij)$, $i^2 = i + \alpha$,
$ji = (i + 1)j$, $j^2 = \gamma$,
with $\alpha$ and $\gamma$ in $F$.

6. Algebras of degree three. We now let three be the degree of $D$ and the
characteristic of $F$. By Theorem 2 there exists a separable cubic sub-field
$F(u)$ of $D$ such that $u$ has


$\phi(\omega) = \omega^3 + \alpha\omega^2 + \beta\omega + \gamma = 0$,
with $\alpha \ne 0$ by Theorem 3. By Theorem 10 we have
$\phi(\omega) = (\omega - u_3)(\omega - u_2)(\omega - u_1)$
where $u = u_1$, $u_2$, $u_3$ are evidently distinct and $u_2$, $u_3$ are transforms of $u$ by


* Cf. E. Steinitz, Algebraische Theorie der Kérper, p. 55. 




quantities of $D$. If
xX = Ugh — 


is zero then evidently ¢(w) is a cyclic equation, D is a cyclic algebra. For 
= implies that is in F(m). 

Hence let x~0. By Wedderburn’s proof for the case where the character- 
istic of F is not three, we have 


XU, = = UzgX, = UX, 


so that $x^3 u_1 = u_1 x^3$ and $x^3$ is in $F$. Let then $x^3 = \delta$ in $F$.
The minimum equation of $x$ with respect to $F$ is


$\psi(\omega) = \omega^3 - \delta = (\omega - x)^3 = 0$,


so that $F(x)$ is inseparable and Wedderburn's proof breaks down. But let
= (U1 — U2) Write Then since 
= xu, = —v~0. Hence is a right divisor of Y(w) but not of 
w—x, and, by Theorem 12, with R=mx,uz!—2x,=vuz! we have w—vx,0— a 
right divisor of (w—2,)*. We have obtained 


(w — x1)? = (w? — 2xw + x?) = (w — 4X3)(w — 2x2), = 
Now 
Xe = = (uy — U2) (1 — = — Ue) — Ue) 
But 
— U2) = (te — U3) %1, (U2 — = — 
and 


= (uy — — 


If x2 = then — U2 = Us. But 3u2 =0, — 22 +3 = +2 =0 
a contradiction. Hence Also x3+%2+%1=0, %3+%2=2m, 
= — %3 —X2 = 2(x1 0. Thus 3, x2, x: are all distinct and we have 
obtained a factorization in D of ¥(w) into distinct factors in spite of the fact 
that ¥(w) =0 is inseparable. 

Moreover (w—2;)? =(w—22)? =(w—23)? =(w—21) (w—4%3) (w—22), so that 
(w — 22)? —(w—21)(w—43) and x1%3 = 

If —x1%2=0, then is in F(x), (%2—21)* =x =0, a contra- 
diction. Hence y= —21%2~0. By the Wedderburn proof* 


YXs=nNy, Y=e in F. 


* These Transactions (loc. cit.), 1921. 




We let 2: = 22 = = ya, a transform of by y. Also yai¥ 
so that Thus 222%: —2122 = — = —x3%1) y? =0. Hence 2 is 
commutative with 2, 22 is in F(z:), z2~z, and F(z;) is cyclic. We have proved 

THEOREM 15. Every normal division algebra of degree three over any infinite 
field F is cyclic. 


INSTITUTE FOR ADVANCED STUDY, 
PRINCETON, N. J. 


THE VALUE OF THE NUMBER g(k) IN 
WARING’S PROBLEM* 


BY 
R. D. JAMES†


1. Introduction. The number $g(k)$ is defined to be such that (a) every
integer is a sum of $g(k)$ $k$th powers $\ge 0$; (b) there is at least one integer which
is not a sum of $g(k) - 1$ $k$th powers $\ge 0$. It is well known that $g(2) = 4$, $g(3) = 9$,
but the exact value of $g(k)$ is not known when $k \ge 4$.
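The definition can be explored by brute force for small $k$. The following sketch computes, for every $n$ up to a chosen limit, the least number of positive $k$th powers whose sum is $n$; the function name `min_powers` and the search limits are choices made for this illustration. A finite search of this kind can only exhibit integers that need many powers; it cannot by itself prove the value of $g(k)$.

```python
def min_powers(k, limit):
    """need[n] = least number of positive kth powers with sum n, for 0 <= n <= limit."""
    powers = []
    b = 1
    while b ** k <= limit:
        powers.append(b ** k)
        b += 1
    need = [0] * (limit + 1)
    for n in range(1, limit + 1):
        # remove one kth power q and reuse the answer already found for n - q
        need[n] = 1 + min(need[n - q] for q in powers if q <= n)
    return need


if __name__ == "__main__":
    squares = min_powers(2, 200)
    cubes = min_powers(3, 300)
    print(max(squares[1:]), squares[7])           # 7 needs four squares
    print(max(cubes[1:]), cubes[23], cubes[239])  # 23 and 239 need nine cubes
```

Since zeros are allowed as summands, an integer is a sum of $s$ $k$th powers $\ge 0$ exactly when `need[n] <= s`.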

The number $G(k)$ is defined to be such that every integer $> C = C(k)$ is a
sum of $G(k)$ $k$th powers $\ge 0$. Hardy and Littlewood‡ have proved that


G(k) S (k — 


where 


(k — 2) log 2 — log k + log (k — 2) 
=| log & — log (4 — 1) | 
In this paper we obtain a similar bound for $g(k)$ when $k \ge 6$. We shall
prove the
THEOREM. Let $L$ be a number $> k^k$ such that every integer $\le L$ is a sum of
$s_3$ $k$th powers $\ge 0$. Let
D = (d + 2)(k — 1) — 2414 1/10, d = [log (k — 1)/log 2]; 
3 log k + log 20 — log (log L — k log k) | 
log k — log (k — 1) , 
F = log 2(log k — log — 1))“!; H = (k — 2)2*-2 + ; 
R= (1+ (1 — a)***)k2** — DO. 


E=s3+ 


Then 
g(k) S +FD+0+£+4+ ((A+FD+0Q — EB) 
+ 4F(ED + R))*!*)] +1. 


The method of proof is as follows: We determine the constants as they
occur at each step of the Hardy-Littlewood analysis as functions of $k$, $s$, and

(1) 


* Presented to the Society, March 18, 1933; received by the editors March 20, 1933. 

† National Research Fellow.

‡ G. H. Hardy and J. E. Littlewood, Some problems of "partitio numerorum" (VI): Further
researches in Waring's problem, Mathematische Zeitschrift, vol. 23 (1925), pp. 1-37. See also E. Landau,
Vorlesungen über Zahlentheorie, vol. 1, part 6 (referred to as L), and M. Gelbcke, Zum Waringschen
Problem, Mathematische Annalen, vol. 105 (1931), pp. 637-652 (referred to as G).




396 R. D. JAMES [April 


$\epsilon$, where $\epsilon$ is a small positive number. In this way we conclude that every
integer $> C(k, s, \epsilon)$ is a sum of $s$ $k$th powers $\ge 0$ when $s \ge g_1(k, \epsilon)$ (Theorem
46). Then, using a theorem proved by L. E. Dickson,* we show that every
integer $\le C(k, s, \epsilon)$ is a sum of $s$ $k$th powers $\ge 0$ when $s \ge g_2(k, \epsilon)$ (Theorem
50). We choose $\epsilon$ as a function of $k$ so that $g_1(k, \epsilon(k)) = g_2(k, \epsilon(k))$ and then
$g(k) \le g_1(k, \epsilon(k)) = g_2(k, \epsilon(k))$.

In Theorem 48 we give a general method for the determination of L 
and s; and prove that 


. 2 3 3 2 9 


when L=(k+1)*—k*>&*. It then follows from (1) that 


For particular values of $k$ we may obtain better values of $L$ and $s_3$. For
example, since $25 \cdot 2^8 = 6400$, $3^8 = 6561$, $26 \cdot 2^8 = 6656$, every integer from 6400
to 6656 is a sum of 185 8th powers $\ge 0$. Repeated application of Theorem 47
yields the result that every integer from 1 to 107%*7 is a sum of 279 8th 
powers=0. With L=10"*7, s,=279, we get g(8)<622. Again, since 25-28 
+9-3%=65449, 4%=65536, 10-3°=65610, 26-2°+9-38=65705, every in- 
teger from 65449 to 65705 is a sum of 120 8th powers20 and this gives 
L = 103-90,000 5, =279, g(8) <595. It is obvious that the larger we can make 
$L$ for a given $s_3$ the better will be the resulting bound for $g(k)$. In the table
below we summarize the known results for $g(k)$ and $G(k)$ when $6 \le k \le 10$.
The first line gives the bounds for $g(k)$ obtained by algebraic methods
separately for each $k$; the second gives the bounds obtained by the methods of
this paper; the third gives the bounds for $G(k)$ obtained by the Hardy-
Littlewood method; and the fourth gives the lower bounds for $g(k)$.
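The arithmetic behind the two intervals of 8th powers quoted above can be checked directly. The sketch below is only an illustration: it recomputes the products and then finds, by dynamic programming over the 8th powers $1$, $2^8$, $3^8$, $4^8$, the largest number of summands needed by any integer in each interval. The names `min_powers` and `worst_case` are hypothetical helpers, not part of the paper's method.

```python
def min_powers(k, limit):
    """need[n] = least number of positive kth powers with sum n, for 0 <= n <= limit."""
    powers = []
    b = 1
    while b ** k <= limit:
        powers.append(b ** k)
        b += 1
    need = [0] * (limit + 1)
    for n in range(1, limit + 1):
        need[n] = 1 + min(need[n - q] for q in powers if q <= n)
    return need


def worst_case(k, lo, hi):
    """Largest number of kth powers needed by any integer n with lo <= n <= hi."""
    need = min_powers(k, hi)
    return max(need[lo:hi + 1])


if __name__ == "__main__":
    print(25 * 2**8, 3**8, 26 * 2**8)                                # 6400 6561 6656
    print(25 * 2**8 + 9 * 3**8, 4**8, 10 * 3**8, 26 * 2**8 + 9 * 3**8)
    print(worst_case(8, 6400, 6656))                                 # first interval
    print(worst_case(8, 65449, 65705))                               # second interval
```

In the first interval the worst integer is $6560 = 25 \cdot 2^8 + 160$, which needs $25 + 160 = 185$ summands; in the second it is $65535 = 9 \cdot 3^8 + 25 \cdot 2^8 + 86$, which needs $9 + 25 + 86 = 120$.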


k            6      7       8       9      10
g(k) ≤     478   3806   31353          140004
g(k) ≤     183    322     595    1177    2421
G(k) ≤      87    193     425     949    2113
g(k) ≥      73    143     279            1079


* L. E. Dickson, Proof of a Waring theorem on fifth powers, Bulletin of the American
Mathematical Society, vol. 37 (1931), pp. 549-553.


1934] WARING’S PROBLEM 397 


The numbers in the last line are probably the exact values of g(k). In 
order to prove g(10) = 1079, for example, it would be necessary to prove some 
inequality like G(10) <700. On the basis of an unproved hypothesis, Hardy 
and Littlewood (loc. cit.) have shown that G(10) would be <21. It seems 
likely, then, that a far less drastic assumption would be sufficient to prove 
g(10) = 1079 and this assumption may be capable of proof. 

The possibility of evaluating the constants of the Hardy-Littlewood 
analysis was suggested by Professor L. E. Dickson. The case of fifth powers 
was considered in the author’s doctor’s dissertation written under Professor 
Dickson’s direction at the University of Chicago. 

2. Notation. We shall use the following notation throughout the paper. 
Let 

$T(m)$ = the number of divisors of $m$;
$\pi(x)$ = the number of primes $\le x$;
$\theta(x)$ = the sum of the logarithms of all primes $\le x$;
$[x]$ = the greatest integer $\le x$;
$\{x\}$ = min $(x - [x], [x] + 1 - x)$;
M(p‘, m) =the number of solutions of the congruence =n (mod 
N(p‘, n) =the number of solutions of the same congruence in which not 
every h; is divisible by p (primitive solutions) ; 
$k$ = an integer $\ge 6$; $a = 1/k$; $K = 2^{k-1}$; $A = 1/K$;
$\epsilon_i$ = a small positive number, $i = 1, 2, 3$; $\eta_i = 1/\epsilon_i$, $i = 1, 2, 3$;
(1—De,) +] +3; 
(2) S2= [((k—2) log 2—log k+log (k —2)) (log k—log (k —1))*] +4; 
A=2A+(1—a)"e,; 
© =the highest power of a prime p which divides k; 
_ [042 if p=2, ». 
(a, 6) =the greatest common divisor of a and 5; 

r=((P—1)/(p—1)) (&, p—1); 

r(n) =rx,.(”) =the number of solutions of =n, h;=0; 
(6, q) =1 


S, =) 1p"; 
A(q) =Ax..(9, 7) where p ranges over all primitive 
qth roots of unity; 


Xp (p*); 
Sj, k, 5, w) = (9); 
f(x) 
5_1(a(a+1) - - (@+j—1)/j!) 


(2) V,(x) =y,(x) +¢,(x) r(i +a)qS,(1 —x/p)-*; 
o(j) =r(j) —(T*(1+a)/T(sa)) k, s, 


The letters A, a, b, B, c, C are numbered in the same way as the corresponding
letters in paper L, while the letters G correspond to the C of paper G.

3. Preliminary theorems. We shall not repeat the proof of a known 
theorem if the constants involved are explicitly given in the original proof. 


THEOREM 1.* For every $\epsilon_1 > 0$


$T(m) \le A_1 m^{\epsilon_1}$,
where

exp + 0((3/2)") + 0((4/3)")+ ---)) 
THEOREM 2. (L, Theorem 112.) For $\xi \ge 2$,
$a_1 \xi/\log \xi < \pi(\xi) < a_2 \xi/\log \xi$,
where $8a_1 = \log 2$ and $a_2 = 7 \log 2$.


Since 


A,= 


$[\eta] - 2[\eta/2] < \eta - 2(\eta/2 - 1) = 2$
and the left side is an integer, it follows that
(3) $[\eta] - 2[\eta/2] \le 1$.


Let $n \ge 2$. For every prime $p \le 2n$ let $f$ denote the greatest integer such that
$p^f \le 2n$ (i.e., $f = [\log (2n)/\log p]$). We show first that
(4) $\prod_{n < p \le 2n} p \;\Big|\; \dfrac{(2n)!}{n!\,n!} \;\Big|\; \prod_{p \le 2n} p^f$.
The first part of (4) follows at once since every $p$ for which $n < p \le 2n$ divides
$(2n)!$ but not $n!\,n!$. Also, since the highest power of a prime $p$ which divides $x!$
is
$\sum_{1 \le m \le \log x/\log p} [x/p^m]$
(see, for example, L, Theorem 27), the highest power of $p$ which divides
$(2n)!/(n!\,n!)$ is
$\sum_{1 \le m \le f} \left([2n/p^m] - 2[n/p^m]\right) \le f$,

* S. Ramanujan, Highly composite numbers, Proceedings of the London Mathematical Society, 
(2), vol. 14 (1915), p. 392. 




by (3). This proves the second part of (4). Next, the left side of (4) has
$\pi(2n) - \pi(n)$ factors each $> n$, and the right side has $\pi(2n)$ factors each $\le 2n$.
Hence

$n^{\pi(2n) - \pi(n)} < \prod_{n < p \le 2n} p \le \dfrac{(2n)!}{n!\,n!} \le \prod_{p \le 2n} p^f \le (2n)^{\pi(2n)}$,

$(\pi(2n) - \pi(n)) \log n < \log ((2n)!/(n!\,n!)) \le \pi(2n) \log (2n)$.

Therefore

$(\pi(2n) - \pi(n)) \log n < \log \binom{2n}{n} < \log \sum_{j=0}^{2n} \binom{2n}{j} = \log 2^{2n} = 2n \log 2$,

(5) $\pi(2n) - \pi(n) < 2(\log 2)\,n/\log n = a_3 n/\log n$;

and

$\pi(2n) \log (2n) \ge \log \binom{2n}{n} = \log \prod_{j=1}^{n} \dfrac{n+j}{j} \ge \log \prod_{j=1}^{n} 2 = \log 2^n = n \log 2$,

(6) $\pi(2n) \ge n \log 2/\log (2n) = n \log 2/(\log n + \log 2) \ge n \log 2/(2 \log n) = a_4 n/\log n$.

From (6) when $\xi \ge 4$

$\pi(\xi) \ge \pi(2[\xi/2]) \ge a_4 [\xi/2]/\log [\xi/2] \ge a_4 \xi/(4 \log \xi) = a_5 \xi/\log \xi$.

When $2 \le \xi < 4$ we have

$\pi(\xi) \ge 1 = ((\log 2)/4)(4/\log 2) > ((\log 2)/4)(\xi/\log \xi)$.

Hence for all $\xi \ge 2$,

$\pi(\xi) \ge a_1 \xi/\log \xi$,

where $8a_1 = 8 \min (a_5, (\log 2)/4) = 2a_4 = \log 2$. This proves the first inequality
of the theorem.
Now, since η = 2 + 2(η/2 − 1) < 2 + 2[η/2] it follows from (5) when η ≥ 8
that


π(η) − π(η/2) = π(η) − π([η/2]) ≤ 2 + π(2[η/2]) − π([η/2])
< 2 + a_3[η/2]/log [η/2] < 2 + a_3η/(2 log (η/2 − 1))
= 2 + a_3η/(2(log (η − 2) − log 2))
= 2 + (a_3 log η/(2(log (η − 2) − log 2)))(η/log η)
≤ (log 8/4)(8/log 8) + (a_3 log 8/(2(log 6 − log 2)))(η/log η)
< ((log 8/4) + (a_3 log 8/(2 log 3)))(η/log η).




When 2 ≤ η < 8 we have
π(η) − π(η/2) ≤ 2 = (log 8)2/log 8 < (log 8)η/log η.


Therefore for all η ≥ 2
π(η) − π(η/2) ≤ a_7η/log η,


where a_7 = max ((log 8/4) + (a_3 log 8/(2 log 3)), log 8) = log 8. Then
π(η) log η − π(η/2) log (η/2) = (π(η) − π(η/2)) log η + π(η/2) log 2
< a_7(log η)η/log η + η(log 2)/2 = (a_7 + (log 2)/2)η = a_8η.
For ξ ≥ 2 we have


π(ξ/2^m) log (ξ/2^m) − π(ξ/2^{m+1}) log (ξ/2^{m+1}) < a_8ξ/2^m,


π(ξ) log ξ = Σ_{m=0}^{∞} (π(ξ/2^m) log (ξ/2^m) − π(ξ/2^{m+1}) log (ξ/2^{m+1}))


< a_8 Σ_{m=0}^{∞} (ξ/2^m) = 2a_8ξ;


that is,
π(ξ) < a_2ξ/log ξ,


where a_2 = 2a_8 = 2(a_7 + (log 2)/2) = 2(log 8 + (log 2)/2) = 7 log 2.
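
As a numerical illustration of Theorem 2 (it is not part of the original argument), the following Python sketch counts primes with a sieve and checks a_1ξ/log ξ < π(ξ) < a_2ξ/log ξ for the integers ξ up to a chosen limit. The function names and the limit are assumptions made only for this example.

```python
import math

def prime_counts(limit):
    """Return a list pi with pi[x] = number of primes <= x, 0 <= x <= limit (limit >= 2)."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = 0
    pi, running = [], 0
    for flag in is_prime:
        running += flag
        pi.append(running)
    return pi

A1 = math.log(2) / 8   # lower constant a_1 of Theorem 2
A2 = 7 * math.log(2)   # upper constant a_2 of Theorem 2

def theorem2_holds(limit):
    """Check a_1*x/log x < pi(x) < a_2*x/log x for every integer 2 <= x <= limit."""
    pi = prime_counts(limit)
    return all(A1 * x / math.log(x) < pi[x] < A2 * x / math.log(x)
               for x in range(2, limit + 1))
```

A finite check of this kind only illustrates the inequality; the constants a_1 and a_2 are far from sharp, which is all the later estimates require.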
THEOREM 3.* We have
ϑ(x) < 6cx/5 + 3 log² x + 8 log x + 5,
ϑ(x) > cx − 12cx^{1/2}/5 − (3/2) log² x − 13 log x − 15,
where c = log (2^{1/2}·3^{1/3}·5^{1/5}/30^{1/30}) = 0.92129···.
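
The constant of Theorem 3 and the two bounds can be spot-checked numerically. The sketch below is only an illustration: it assumes that c is the logarithm of the displayed quotient (this is what reproduces 0.92129) and evaluates ϑ by trial division, which is adequate for small x. The function names are hypothetical.

```python
import math

# Assumption: c is the logarithm of 2^(1/2) 3^(1/3) 5^(1/5) / 30^(1/30).
C = math.log(2 ** (1 / 2) * 3 ** (1 / 3) * 5 ** (1 / 5) / 30 ** (1 / 30))

def theta(x):
    """Sum of log p over primes p <= x (trial division, small x only)."""
    total = 0.0
    for n in range(2, int(x) + 1):
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            total += math.log(n)
    return total

def upper(x):
    """Right side of the first inequality of Theorem 3."""
    return 6 * C * x / 5 + 3 * math.log(x) ** 2 + 8 * math.log(x) + 5

def lower(x):
    """Right side of the second inequality of Theorem 3."""
    return (C * x - 12 * C * math.sqrt(x) / 5
            - 1.5 * math.log(x) ** 2 - 13 * math.log(x) - 15)
```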
THEOREM 4. (L, Theorem 264.) Let t be an integer, m > 0, z ≥ 0, and


S = Σ_{h=t+1}^{t+m} e^{2πizh^k}.


Then
|S|^K < 4^K (m^{K−1} + m^{K−k} Σ_{h_1,···,h_{k−1}=1}^{m} min (m, 1/{zk!h_1 ··· h_{k−1}})).


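THEOREM 5. The number of solutions of the equation
(7) h_1h_2 ··· h_{k−1} = v, 1 ≤ h_i ≤ m,
is at most A_2m^{ε_2} where A_2 = A_1^{k−2} and ε_2 = ε_2(k) is given by

(8) ε_2(2) = 0; ε_2(k) = (k − 1)ε_1 + ε_2(k/2) + ε_2(k/2 + 1), k even ≥ 4;
ε_2(3) = 2ε_1; ε_2(k) = (k − 1)ε_1 + 2ε_2((k + 1)/2), k odd ≥ 5.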

* E. Landau, Handbuch der Lehre von der Verteilung der Primzahlen, §22.






(i) Let k = 2. Then h_1 = v has at most 1 = A_1^{2−2}m^0 = A_2m^{ε_2} solutions.
(ii) Let k = 3. Then h_1h_2 = v has at most T(v) solutions since h_1 must divide v.
By Theorem 1,


T(v) ≤ A_1v^{ε_1} ≤ A_1m^{2ε_1} = A_2m^{ε_2}


since v = h_1h_2 ≤ m². (iii) Let k be even ≥ 4 and assume that the theorem is true
for all integers < k. In equation (7) write

(9) h_1h_2 ··· h_{k/2−1} = z_1,
(10) h_{k/2} ··· h_{k−1} = z_2.

There are at most A_1^{k/2−2}m^{ε_2(k/2)} solutions of (9) and at most A_1^{k/2+1−2}m^{ε_2(k/2+1)}
solutions of (10). The equation v = z_1z_2 has at most T(v) solutions. Hence the
number of solutions of (7) is


≤ T(v) A_1^{k/2−2}m^{ε_2(k/2)} A_1^{k/2−1}m^{ε_2(k/2+1)}
≤ A_1m^{(k−1)ε_1} A_1^{k−3}m^{ε_2(k/2)+ε_2(k/2+1)} = A_2m^{ε_2(k)}.


(iv) Let k be odd ≥ 5 and assume that the theorem is true for all integers < k.
As in the proof of (iii) with


h_1h_2 ··· h_{(k−1)/2} = z_1, h_{(k+1)/2} ··· h_{k−1} = z_2,
the number of solutions of (7) is


≤ T(v) A_1^{k−3}m^{2ε_2((k+1)/2)} ≤ A_1^{k−2}m^{(k−1)ε_1+2ε_2((k+1)/2)} = A_2m^{ε_2(k)}.
COROLLARY. Let d = [log (k − 1)/log 2]. Then
(11) ε_2(k) = ((d + 2)(k − 1) − 2^{d+1})ε_1.
(i) Let k = 2 and hence d = 0. Then by Theorem 5
ε_2(2) = 0 = ((d + 2)(k − 1) − 2^{d+1})ε_1.


(ii) Let k > 2 and hence d > 0. We have 2^d + 1 ≤ k ≤ 2^{d+1}. Assume that
(11) is true for all integers < 2^d + 1. If k is even and < 2^{d+1} we have


2^{d−1} + 1 ≤ k/2 < 2^d,
2^{d−1} + 2 ≤ (k + 2)/2 ≤ 2^d.


By (8)


ε_2(k) = (k − 1)ε_1 + ε_2(k/2) + ε_2(k/2 + 1)
= (k − 1)ε_1 + ((d + 1)(k/2 − 1) − 2^d)ε_1 + ((d + 1)(k/2) − 2^d)ε_1
= ((d + 2)(k − 1) − 2^{d+1})ε_1.
If k = 2^{d+1} we have k/2 = 2^d, (k + 4)/4 = 2^{d−1} + 1 and
ε_2(k) = (k − 1)ε_1 + ε_2(k/2) + ε_2(k/2 + 1)
= (k − 1)ε_1 + ε_2(k/2) + (k/2)ε_1 + 2ε_2((k + 4)/4)
= (k − 1)ε_1 + ((d + 1)(k/2 − 1) − 2^d)ε_1 + 2^dε_1 + 2((d + 1)((k + 4)/4 − 1) − 2^d)ε_1
= ((d + 2)(k − 1) − 2^{d+1})ε_1.
If k is odd then 2^{d−1} + 1 ≤ (k + 1)/2 ≤ 2^d and we have
ε_2(k) = (k − 1)ε_1 + 2ε_2((k + 1)/2)
= (k − 1)ε_1 + 2((d + 1)((k + 1)/2 − 1) − 2^d)ε_1
= ((d + 2)(k − 1) − 2^{d+1})ε_1.
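
The induction in the Corollary can be confirmed mechanically. The Python sketch below computes the coefficient of ε_1 in ε_2(k) from the recursion (8) and compares it with the closed form (11); the function names are chosen here for illustration only.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def eps2_coeff(k):
    """Coefficient of eps_1 in eps_2(k), computed from the recursion (8)."""
    if k == 2:
        return 0
    if k == 3:
        return 2
    if k % 2 == 0:
        return (k - 1) + eps2_coeff(k // 2) + eps2_coeff(k // 2 + 1)
    return (k - 1) + 2 * eps2_coeff((k + 1) // 2)

def closed_form(k):
    """Coefficient of eps_1 given by (11), with d = [log(k-1)/log 2]."""
    d = (k - 1).bit_length() - 1  # exact integer floor of log2(k - 1)
    return (d + 2) * (k - 1) - 2 ** (d + 1)
```

For k = 6, the smallest exponent treated in this paper, both functions give 12, so that ε_2(6) = 12ε_1.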
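THEOREM 6. (L, Theorem 266.) Under the hypotheses of Theorem 4,


|S|^K < C_15m^{ε_2} (m^{K−1} + m^{K−k} Σ_{v=1}^{k!m^{k−1}} min (m, 1/{zv})),
where C_15 = 4^KA_2.
In the summation in Theorem 4 write k!h_1h_2 ··· h_{k−1} = v. By Theorem 5
each v appears at most A_2m^{ε_2} times. Hence


Σ min (m, 1/{zk!h_1 ··· h_{k−1}}) ≤ A_2m^{ε_2} Σ_{v=1}^{k!m^{k−1}} min (m, 1/{zv}),


|S|^K < 4^K (m^{K−1} + m^{K−k}A_2m^{ε_2} Σ_{v=1}^{k!m^{k−1}} min (m, 1/{zv}))


≤ C_15m^{ε_2} (m^{K−1} + m^{K−k} Σ_{v=1}^{k!m^{k−1}} min (m, 1/{zv})).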


THEOREM 7. Let x ≥ 1, b > 0. Then for every ε_3 > 0,
b + log x ≤ A_3x^{ε_3},
where A_3 = 1/(ε_3e^{1−bε_3}).
Consider the function
y = (b + log x)x^{−ε_3}, y′ = x^{−ε_3−1}(1 − ε_3(b + log x)),
y″ = x^{−ε_3−2}(−1 − 2ε_3 + ε_3(1 + ε_3)(b + log x)).
We have y′ = 0 when x = ∞ or b + log x = 1/ε_3. The second value gives a
maximum. Hence
max y = 1/(ε_3e^{1−bε_3}),


b + log x ≤ x^{ε_3}/(ε_3e^{1−bε_3}) = A_3x^{ε_3}.
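
A short numerical check of Theorem 7 may be helpful. The sketch below evaluates the gap A_3x^{ε_3} − (b + log x), which should be non-negative for x ≥ 1 and should vanish at the maximizing point log x = 1/ε_3 − b. The helper names are assumptions for this example.

```python
import math

def a3(b, eps):
    """The constant A_3 = 1/(eps * e^(1 - b*eps)) of Theorem 7."""
    return 1.0 / (eps * math.exp(1.0 - b * eps))

def gap(x, b, eps):
    """A_3 * x^eps - (b + log x); Theorem 7 asserts this is >= 0 for x >= 1."""
    return a3(b, eps) * x ** eps - (b + math.log(x))
```

With b = 1 and ε_3 = 0.1 the gap is zero (up to rounding) at x = e^9, which shows that the constant A_3 cannot be lowered for that choice of b and ε_3.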
4. The singular series. The series
𝔖 = 𝔖(j, k, s, ∞) = Σ_{q=1}^{∞} A(q)


is called the singular series. In this section we show that 𝔖 = Π_p χ_p and from
this that 𝔖 > b_4. We shall follow closely part 6, chapter 2, of paper L.


THEOREM 8. (L, Theorem 293.) Let n = m_0p^{βk+σ} ≠ 0, where β ≥ 0, 0 ≤ σ < k,
(m_0, p) = 1. Let t_0 = max (βk + σ + 1, βk + γ). Then


(12) A(p^t) = 0 when t > t_0,


(13) Xp = PN (P, 0) +. Pl-*N(P, n/ per), 


a=0 
where the summation is omitted if B =0. 


Remark. This theorem shows that the terms of the series χ_p = Σ_{t=0}^{∞} A(p^t)
are all zero after a certain one. Also, since everything on the right side of (13)
is positive or zero it follows that χ_p ≥ 0. If p ∤ 2kn we have γ = 1, β = σ = 0,
t_0 = 1 and thus χ_p = 1 + A(p).


THEOREM 9. (L, Theorem 301.) For s ≥ r and every n ≢ 0 (mod p),
N(P, n) > 0;
for s ≥ r + 1 and every n,
N(P, n) > 0.
COROLLARY. For s ≥ r + 1 and every n,
N(P, n) ≥ P^{s−r−1}.


In the congruence


(14) h_1^k + h_2^k + ··· + h_s^k ≡ n (mod P)
write n_1 = n − h_{r+2}^k − ··· − h_s^k. By Theorem 9 the congruence

h_1^k + h_2^k + ··· + h_{r+1}^k ≡ n_1 (mod P)
has at least one primitive solution. Since h_{r+2}, ···, h_s may be chosen
arbitrarily mod P it follows that (14) has at least P^{s−r−1} primitive solutions.


THEOREM 10. (L, Theorem 302.) If k ≥ 5, s ≥ 4k, then


χ_p ≥ P^{−r} = b(p).
For k ≥ 5,


| k 
e+2 — = 
| (2 1) < 4k, p = 2, 


py —1/k 

< 2h, p > 2. 


Hence s ≥ 4k implies s ≥ r + 1 and from the Corollary to Theorem 9 we get
N(P, n) ≥ P^{s−r−1}. By Theorem 8 either χ_p = P^{1−s}N(P, n) (β = 0) or
χ_p ≥ P^{1−s}N(P, 0) (β > 0). In both cases it follows that


χ_p ≥ P^{1−s}P^{s−r−1} = P^{−r} = b(p).
THEOREM 11. (L, Theorem 307.) If q = p^t, p ∤ k, 2 ≤ t ≤ k, then
S_ρ = p^{t−1}.
THEOREM 12. (L, Theorem 311.) If q = p, then
|S_ρ| ≤ (k − 1)p^{1/2}.
THEOREM 13. (L, Theorem 313.) Let T_ρ = q^{a−1}S_ρ. Then if q = p^t, t > k,
T_ρ = T_{ρ^{p^k}}.
THEOREM 14. (L, Theorem 314.) If q = p^t, t ≥ 1, then


|T_ρ| ≤ 1 for p > c_38,
|T_ρ| ≤ c_39 always,
where c_38 = k^{2k/(k−2)}, c_39 = k.


If ρ = e^{2πib/p^t}, (b, p) = 1, is a primitive p^t th root of unity, then ρ^{p^k}
is a primitive p^{t−k}th root of unity. Hence in view of Theorem 13 we may
assume 1 ≤ t ≤ k.

(i) If p ∤ k, 2 ≤ t ≤ k, then by Theorem 11


|T_ρ| = q^{a−1}|S_ρ| = p^{t(a−1)}p^{t−1} = p^{ta−1} ≤ 1.
(ii) If p ∤ k, t = 1, Theorem 12 gives
|T_ρ| = p^{a−1}|S_ρ| ≤ (k − 1)p^{a−1/2} < kp^{−(k−2)/(2k)}.
(iii) If p | k, then
|T_ρ| ≤ q^{a−1}q = p^{ta} ≤ p ≤ k.




It follows from (i), (ii), and (iii) that |T_ρ| ≤ 1 when p > max (k^{2k/(k−2)}, k) = c_38
and for all p we have |T_ρ| ≤ max (1, k, k) = k = c_39.


THEOREM 15. (L, Theorem 315.) We have
|T_ρ| < c_15 and hence |S_ρ| < c_15q^{1−a},
where log c_15 = (k − 1) log k + a_2(k − 2)k^{2k/(k−2)}/(2k).
When (q_1, q_2) = 1 we have S_{ρ_1ρ_2} = S_{ρ_1}S_{ρ_2} (L, Theorem 281). Therefore
T_{ρ_1ρ_2} = T_{ρ_1}T_{ρ_2}. For q > 1 write q = p_1^{t_1} ··· p_r^{t_r} = Π p_i^{t_i}. Then


T_ρ = Π_1 Π_2 Π_3,
where Π_1 contains those p_i for which p_i | k and p_i ≤ c_38, Π_2 contains those for
which p_i ∤ k and p_i ≤ c_38, and Π_3 contains those for which p_i > c_38. By the proof
of Theorem 14 we have


psk 

I] 1=1; 
p>c38 

II! = II 


log log k = r(csg) log & < log k/log (Theorem 2) 


PScz3g 


= — 
Hence when q > 1
log |T_ρ| = log |Π_1| + log |Π_2| + log |Π_3|
< (k − 1) log k + a_2(k − 2)k^{2k/(k−2)}/(2k) = log c_15.
If q = 1 then |T_ρ| = 1 < c_15.
THEOREM 16. (L, Theorem 316.) We have 


| A(q)| < 
where by 


We use Theorem 15: 
|A(@)| S < = bisg'**. 


THEOREM 17. (L, Theorem 317.) If p/n then 
| A(p)| < 


where be =k*. 


THEOREM 18. (L, Theorem 318.) We have 


|A(p)| < bap, 
where be; =max (bao, b22) =k. 


If p/n Theorem 17 gives 


| A(p)| < < 
If p|m it follows from Theorem 12 that 
< = 


THEOREM 19. (L, Theorem 319.) For s2=4, 


S 4@ 


converges absoluiely and 


S II Xp- 
Pp 
THEOREM 20. (L, Theorem 320.) If p/n then 


|x» 1| < 


where bos = max (deo, bes) = bap 
(i) Let p}(2k). By the remark after Theorem 8 and by Theorem 17, 


|x» — 1] = |A(p)| < 


(ii) Let p| (2k). Then since p/n wehave 6 =0 and &= max (1,7) Sk. By (12) 


to k a 
Ix» —1| = D (2k)! 


t=1 


(2k)* — 1 
2k— 1 


pk k+l Bk 
(2k) "2-8/2 (2k)¥2-#/2 
= Dog 


< 




THEOREM 21. (L, Theorem 220.) Let D_q(m) denote the sum of the mth
powers of all primitive qth roots of unity. Then


D_q(m) = Σ_{d|(q,m)} d·μ(q/d),


where
μ(n) = 1 if n = 1,
μ(n) = (−1)^j if n is the product of j distinct primes,
μ(n) = 0 otherwise.


COROLLARY. If p is a prime then


D_{p^t}(m) = p^t − p^{t−1} if p^t | m,
D_{p^t}(m) = −p^{t−1} if p^{t−1} | m, p^t ∤ m,
D_{p^t}(m) = 0 if p^{t−1} ∤ m.


Proof: D_{p^t}(m) = Σ_{d|(p^t,m)} d·μ(p^t/d) and μ(p^t/d) = 1 if d = p^t, μ(p^t/d) = −1
if d = p^{t−1}, and μ(p^t/d) = 0 in all other cases.
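
Theorem 21 and its Corollary lend themselves to a direct numerical check. The sketch below (an illustration with hypothetical function names) sums the mth powers of the primitive qth roots of unity in floating point and compares the rounded value with the divisor formula.

```python
import cmath
from math import gcd

def mobius(n):
    """The Moebius function mu(n) for n >= 1."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # a squared prime factor
            result = -result
        p += 1
    return -result if n > 1 else result

def d_direct(q, m):
    """Sum of the m-th powers of all primitive q-th roots of unity (rounded)."""
    total = sum(cmath.exp(2j * cmath.pi * b * m / q)
                for b in range(1, q + 1) if gcd(b, q) == 1)
    return round(total.real)

def d_formula(q, m):
    """The right side of Theorem 21: sum of d * mu(q/d) over d dividing (q, m)."""
    g = gcd(q, m)
    return sum(d * mobius(q // d) for d in range(1, g + 1) if g % d == 0)
```

For q = 8 the three cases of the Corollary appear as `d_formula(8, 8) = 4`, `d_formula(8, 4) = -4`, and `d_formula(8, 2) = 0`.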


THEOREM 22. (L, Theorem 321.) If p*}m then 
|x» — 1] < basp-*?, 
where be, =max (be3, beg, bez) = 
If p/n Theorem 20 gives 
lx» — 1| < < 


Hence we may assume p|m and since p*in we have B=0, 1Sa0Sk—-1. 
(i) Let p}(2k). Then y =1, (0 +1, 1) =0+1 and 


o+1 


(15) —1=A(p) + A(p'). 


t=—2 
Also, by Theorem 11, S,=p*! since 2<¢So+1<k and then from the 
Corollary to Theorem 21 we obtain 


n) 
p' — when p*|, that is, when 2 ¢ So, 
(16) =p — p*! when pt '|n, pin, that is, when ¢ = o + 1, 
0 when p‘-!}m, which does not occur. 


Therefore from (15), (16), and Theorem 18 we get 




o+1 


lx» — 1| 4@+ | 


= | A(p) < + pi 
< (ber + = 
(ii) Let p |(2k). Then & =max (o+1, y) Sk and 


k 
= 1| < (2k)* < < 


t=1 
THEOREM 23. (L, Theorem 322.) We have 
Xp > 1 bogp!-*/?, 
where bos =max (bes, bao) 


If p*}n, Theorem 22 gives x,>1—bosp'-*/*. Hence let p*|n, that is, 
B>O. (i) Let p>k so that y=1, P=p. Applying Theorem 8 twice we get 


Xp = p'*N(p, 0) = p'*N(p, = 
Since we have 
Xp = > 1 — 
by Theorem 22. (ii) Let p<k. Then 
Xp = 0 => 1— boop! */2, 
THeoreEM 24. (L, Theorem 324.) Jf =bis then 
Xp >1— 
By Theorem 23 
Xp > 1 — = 1 — (1 + > 1 — 
when p> = dys. 
THEOREM 25. (L, Theorems 325-326.) We have 
S>b 


b= II TI] 


ps bis p> bis 
We use Theorems 19, 10, and 24: 


II] x-iIlx> II 


psbis p> bis psbis p> bis 


where 




5. The main lemma for the third Hardy-Littlewood theorem. In this sec- 
tion we follow the methods of paper G. We shall assume that 


(17) ™ = 17 when k = 6, 
m = D-+ 2*-* when 2 7, 
so that from (2) 
(18) (k — 5 < 4(k — 2)2*-?+ 4k. 
As we shall see later our final choice of 7; satisfies the conditions (17). It 
follows from the second part of (18) that 


Z-1 1 
K 2—K K(2s—K) ~K* 
2s—2k 2s—2K 2s(K — k) 


2—-k %-K (2s—k)\2s—K) 


We may then choose @=0(k, s, €) so that 


2s — 2k 1 
(19) -d2—, 
2s —k 2K? 

(20) 
K ~ 2K? 

2s — 2K 1 
2s —K 


The purpose of this section is to find an approximation for 


| - dX vs(x) | | 


taken around the unit circle |x| = 1. We divide the circumference into
sub-arcs in the following manner. On the circle we take the points ρ = e^{2πib/q} which
correspond to the Farey fractions* with denominators q ≤ n^{1−a}. The mediants
between two neighboring Farey fractions form the end points of our
sub-arcs. It is known that if x is any point of an arc which contains the point ρ
then


* For the definition and properties of Farey fractions see L, pp. 98-100. 




x = ρe^{2πiy}, −y_1 ≤ y ≤ y_2,
1/(2qn^{1−a}) ≤ y_1 < 1/(qn^{1−a}), 1/(2qn^{1−a}) ≤ y_2 < 1/(qn^{1−a}).
The arcs for which n^a < q ≤ n^{1−a} are called minor arcs and are denoted by m;
those for which 1 ≤ q ≤ n^a are called major arcs and denoted by M. Each
major arc is further divided into two sub-arcs denoted by M, when |y| 
<1/(2q’n'-%), and by Mz when 
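
The Farey dissection described above can be made concrete with exact rational arithmetic. The following Python sketch is an illustration under the assumption that an integer `order` stands in for the bound n^{1−a} on the denominators. It lists the Farey fractions, forms the mediants that bound each arc, and allows the two-sided estimate for the half-widths y_1, y_2 to be checked; the fractions 0 and 1, whose arcs wrap around the circle, are left out.

```python
from fractions import Fraction

def farey(order):
    """Farey fractions b/q in [0, 1] with q <= order, in increasing order."""
    return sorted({Fraction(b, q) for q in range(1, order + 1)
                   for b in range(q + 1)})

def arcs(order):
    """Map each interior Farey fraction to the two mediants bounding its arc."""
    fs = farey(order)
    result = {}
    for left, mid, right in zip(fs, fs[1:], fs[2:]):
        lo = Fraction(left.numerator + mid.numerator,
                      left.denominator + mid.denominator)
        hi = Fraction(mid.numerator + right.numerator,
                      mid.denominator + right.denominator)
        result[mid] = (lo, hi)
    return result
```

For example, with `order = 3` the arc around 1/2 runs from 2/5 to 3/5.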


THEOREM 26. (G, Theorem 1; L, Theorem 140.) Let |y| ≤ 1/2, a_0 ≥ a_1
≥ ··· ≥ a_N ≥ 0. Then


|Σ_{j=0}^{N} a_je^{2πijy}| ≤ a_0/|sin πy|.


THEOREM 27. (G, Theorem 2; L, Theorem 223.)


∫_{−1/2}^{1/2} |Σ_{j=0}^{N} a_je^{2πijy}|² dy = Σ_{j=0}^{N} |a_j|².


THEOREM 28. (G, Theorem 3; L, Theorem 262.)


Σ_{j=0}^{N} r²_{k,2}(j) ≤ G_3N^{2a+ε_1},


where G_3 = 4(k − 1)A_1.


If k is even r_{k,2}(j) is at most equal to the number of solutions of h_1² + h_2² = j
and this is
4 Σ_{u|j, u odd} (−1)^{(u−1)/2} ≤ 4T(j) ≤ 4A_1j^{ε_1}


by Theorem 1. If k is odd,
j = h_1^k + h_2^k = (h_1 + h_2)(h_1^{k−1} − h_1^{k−2}h_2 + ··· + h_2^{k−1})
implies that h_1 + h_2 divides j. For each positive divisor d of j the two equations
h_1^k + h_2^k = j, h_1 + h_2 = d


have at most k − 1 solutions in common since the elimination of h_2 between
them yields an equation of degree k − 1. Therefore when k is odd


r_{k,2}(j) ≤ (k − 1)T(j) ≤ (k − 1)A_1j^{ε_1}.
Thus for all k ≥ 6
r_{k,2}(j) ≤ max (4, k − 1)A_1j^{ε_1} = (k − 1)A_1j^{ε_1}.


where 




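Next, Σ_{j=0}^{N} r_{k,2}(j) is the number of solutions of h_1^k + h_2^k ≤ N, h_1 ≥ 0, h_2 ≥ 0.
Since h_1 ≤ N^a, h_2 ≤ N^a, this is at most (1 + N^a)(1 + N^a) ≤ 4N^{2a}. Finally


Σ_{j=0}^{N} r²_{k,2}(j) ≤ max_{0≤j≤N} r_{k,2}(j) · Σ_{j=0}^{N} r_{k,2}(j) ≤ (k − 1)A_1N^{ε_1}·4N^{2a} = G_3N^{2a+ε_1}.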
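
The two elementary counting steps in the proof of Theorem 28 can be illustrated for small parameters. The sketch below tabulates r_{k,2}(j), the number of ordered pairs h_1, h_2 ≥ 0 with h_1^k + h_2^k = j, for all j ≤ N. The function name and the sample values of k and N are assumptions made for the example.

```python
def rep_counts(k, n_max):
    """r[j] = number of ordered pairs (h1, h2), h1, h2 >= 0, with h1**k + h2**k == j."""
    r = [0] * (n_max + 1)
    h1 = 0
    while h1 ** k <= n_max:
        h2 = 0
        while h1 ** k + h2 ** k <= n_max:
            r[h1 ** k + h2 ** k] += 1
            h2 += 1
        h1 += 1
    return r
```

With `r = rep_counts(6, 500000)` one can confirm that `sum(r)` does not exceed (1 + N^a)² and that the sum of the squares of the entries is at most `max(r) * sum(r)`, which are exactly the two inequalities combined at the end of the proof.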
THEOREM 29. (G, Theorem 4; L, Theorem 277.) For 8>0 and j a positive | 
integer, 
where y(B) is independent of 7. 
It is known that* #,| 
lim T(6 + 1+ = 1. 


Let Then 


j+1 


(22) @(v — 1) — &(v) = — 1)(1 — (1 + B/v)(1 — 1/0)). 


If 8 is an integer this expression is 


+ 1] ot i+1/ # 


Next, suppose that @ is not an integer. Since 


sunt 


* See, for example, Whittaker and Watson, Modern Analysis, Chapter XII. 


Also, 




der 


when ¢= [8]+1, and since 


B+1\ @+1)8--- 
12 


we have 


| 

2 3 
t + 1/ [8] + 2 pith 


te 


< (29-20 + 28 (2B-26+1 + 
v — 1/2’? 
Hence 
(23) — (1+ 8/s)(1 — 1/v)*| < max (8-261, 48(28 + = 
Also, 
log &(v — 1) = log T(8 + v) — log ((v — 1)!) — B log (v — 1) 


v1 
log ( (8 + 1») — — B log (v—1) 


n=l 


(log (8 + n) — log m) + log T(6 + 1) — B log (v — 1) 


n=1 


v1 
(Gn-") + log + 1) — B log (v — 1) 


n=l 
sa+ef u-'du + log T'(8 + 1) — B log (v — 1) 
1 


= B+ 8B log (v — 1) + log '(8 + 1) — B log (» — 1) 
= 6B + log + 1). 

Therefore 

(24) — 1) S + 1) = ¥2(8). 




From (22), (23), and (24) we get 
| — 1) — = 


(#@ 1) - 


vm j+1 


< 7(8) v*s + 1)-*+ < 7(6)j"'. 
i+1 


va j+1 


+ 1+ —1| = | = 


CorOLLarY. For B>0 and j an integer =1, 
+ 1+ j)/j! > vaj*, 
where 
Since =(@+1+7) I'(6+1+/) it follows that T'(6+2+)/j! 
if Hence we may assume 
(i) Let 721+7(1). Then 
+ 1+ j)/j! > 7% — > — 
= + y(1))-* = 
(ii) Let 157<1+7(1). Then 
= (6+ (B+ + 2)/j! > j1T(2)/j! = 1 
= (1 + y(1))(1 + > + 
= + y(1))—* = va7*. 


THEOREM 30. (G, Theorem 5; L, Theorem 267.) Let t be an integer, m>0, 
and 


Then for every 
| < + + mE-*q), 
where Gs=Cig=8k! CisAs. 
By Theorem 6 with z=b/g we have 
(25) < Cysm* (mx + min (m, 1/{00/4})). 


vel 


Divide the summation into partial sums according to the j in 


t+m 
S= p*. 
het+1 


bv = j (mod q), O<jSq-1. 
Since {b,/q} = {b2/q} when b=: (mod g) we have 
min (m, 1/{b0/q}) = min (m, 1/{j/q}). 


Each partial sum has at most k! m*-!g-'+-1 terms and thus 


kim*- 


min (m, 1/{b0/q}) + 1) min (m, 


vel 7=0 


Also 


F min (m,1/{j/q}) Sm Sm+2 (1/{j/q}) 


j=0 


= m+2 <m+2 Daj S m+ + log 9) 


1S iS 4/2 
< m + 2qA3q* (Theorem 7). 


Therefore 


kim*-1 
DX min (m, 1/{b0/q}) < + 1)(m + 


S + 1)(m + g) 
< + + 
Combining this result with (25) we obtain 
| << + mE-*. 4k! + m*! + 
S 4k! C 5A + + + 
< 8k!C15A + + m*-*q). 
We choose ¢; = €;/(10%) so that e2+ke;= De. 
THEOREM 31. (G, Theorem 6.) On the entire circle |x| =1 
< 
where Gs (1—a?)*+(2a+1) (a+1)-). 
We use Theorems 15 and 29: 


j=1 


< (ra +a)+a + y(a)j*"')(a 




(ra + y(a))(a + 


+ f "Get + y(a)j2-*)dj )) 
< + a) + a((it+ y(a))(a + 1)-! + n2a-! + y(a)(1 — 


= Gyn*q-*. 
THEOREM 32. (G, Theorem 7.) We have 


where G; =}ac(1+y(a)), 
(i) By Theorems 15, 26, and 29, 


j=n+l 
S + a)-a(a + 1)--- (a+ + 1)! sin y|) 
= + + a + n)/n!)(sin | 
< + y(a)n2-1)(2| 
S $acis(1 + y|-!. 
(ii) From Theorem 15 it follows that 
|w,(x)| = + a)gtS,(1 — x/p)| 
= + -|1 — 
= + - |e” — 
<T(1+ a)cisq~2| 2 sin 
< + a)crsq-*(4| y|)-* < y|-*. 
THEOREM 33. (G, Theorem 8.) We have 
|¥(x)| <Gaq-* min (n*, | y|-*), 


where Gy =max (Gs, Gi:+Gs). 
(i) Let | y| <1/n. By Theorem 31 


| ¥,(x)| < Gsn%q-* = Gsq~*min (n°, | y| 




(ii) Let |y|21/n. From Theorem 32 we get 
= | y|-* + y|-* 
= G, + Gs)q-* min (n°, | y|-*). 
THEOREM 34. (G, Theorem 9.) We have 


c-M, 


=62sa—1— (2sa—1)6 2, 
2sa — 1 2sa — 2 — (2sa —1)0 


Gio = 


The integral is taken around the entire circle with the exception of the arc M, 
itself and the summation extends over all M,-arcs. 


From Theorem 33 we get 


f | |?|dz| < 2G f y-*edy 
1 


c-M, 


The exponent of g is <—2 since 0<(2sa—2)/(2sa—1) by (19). Also, for 
each g there are at most g arcs. Hence 


f | |2|da| 


1 c—M, 


is qSne 


a+(2ea—1 dq ) 


< — +4 (25a — 2 — (2sa — 1)8)—*) 


= (220-1) 60 


1 


In the exponent of m we have 
(2sa — 1)6a > ((2s — k)/k?)((2s — 2K)/(2s — K)) (by (21)) 
= ((2s — k)/k*)(((k — 2)K — 2K)/((k — 2)K —K)) (by (2)) 
= ((2s — k)/k*)(\k — 4)/(k — 3)) 


M, 
where 




((k — 2)K + 2k + (1 + (1 — a)**"*)kKes)(k — 4) 


k(k — 3) 
> 2a + 2A + (1 — a)**-%e, = 204 +X. 
Therefore G 6a < G 


THEOREM 35. (G, Theorem 10.) On m we have 
| f(x)| < 
where 
Let 7(j) =>} 0", 720, 7(—1) =0. Then 


— (x/p)**+*) + r(n)(x/p)" 
n—1 
= (1 — x/p) Dor(j)(x/p)i + r(m)(x/p)*. 
7=0 


By Theorem 30 with m= [j*]+1, #=—1, 

js K 

h=0 


+ ([j2] + + + 
S + + 
S + (2n2)Kq-! + 


= 


< (1-2) «3(2K—-1 + + 2K—k)yaK—o < q s ni-e on m) 
< (e2 + — les < + hes = Dey); 


Also, 
|1 — x/p| = |1 — = |extv — = 2|sin ry| < 2n|y| < 2x/(qn'-*). 


Therefore from (26) and (27) 
n—1 


< (2nn/(qn'-*) + <4) 


4 

+ 




THEOREM 36. (G, Theorem 11.) We have 
f | f*(x)| a| dx| < | 


where = 
By Theorem 35, 


(28) 2s—4y, (28—4) a— (28—4) (1—De1) aA 2 
< 1 | Ady. 


-12 
But f°(x) =>. j-0R(j)x!, where 
R(j) = re,2(J), OSjSn; 
0 <= RY) nm<j Sdn; 
and by Theorems 27 and 28, 


1/2 2n 2n 
f = XRG) < 


1/2 j=0 j=0 


Combining this with (28) we get 
| f*(x)| dx| < G (2s—4)a—(2s—4) (1—Dex) | 


The exponent of m equals 
2sa — 1 — ((2s — 4)(1 — Des)aA —1— 2a) + & 
= 2sa— 1 — (2s — 4)(1 — De.) — (k — 2)K — 2k 
-G40- — 2A — (1 — 

S 2sa — 1 — 2A — (1 — a)**e, (by (2)) 
=2sa—1-—Xz 

and the theorem follows. 

THEOREM 37. (G, Theorem 12.) We have 
| f(x) — < (n| y|, 1), 


where Gis=(24+1) (2(3G,)4+7/(a)). 
As before 




(29) f(x) = (1 — + 


j=0 


j=0 


+ (M1 0(2/0)") 


(30) = (1 — 2/p) 0(j)(x/p)! + v(n)(2/p). 


j=0 
Each r(j) has [j*]+1 terms and so may be written as [j*/¢] partial sums 
each equal to S, and [j*]+1-— [j*/q]q<gq further terms. Then by Theorem 
30 with m<q, 
— < + + gh 
< + < €2 + kes = De); 
| 74) — Ge/a)So| = — Lie/a]5,| + 
(31) < 
Since by Theorem 29 
we have 
(32) | — = + + — < 
From (31) and (32) it follows that 
— = |7G) — + — 06)| 

< |7(4) — + | — | 

< 4440 + (a) 

(2(3G,)4 + 
Then from (29) and (30) 

n—1 
| f(x) — =} (1 — — + — 0())(x/p)* 


< (2xn| y| + 1)(2(3G,)4 + 
< + 1)(2(3G,)4 + (n| |, 1). 




THEOREM 38. (G, Theorem 13.) We have 


f | f*(x) — dx] < Gee + 
M, YM, 


», (25 + 2)(2s + 3 — (25 + — — De)A) 
(2s + 1)(2s + 2 — (2s + 1)0 — 2s(1 — De;)A) 


2sa — 2a — 2)(2sa — 3 + 2(1 — De,)A 
Ges Gs ( Sa a Sa ( 1) ) 
(2sa — 2a — 3)(2sa — 4+ 2(1 — De;)A) 


Write ,(x) =f(x)—y,(x). Then 
| — vor(x)|* = | + — 
= | (x) (2)? 
< | &,(x)| 6,(x)| + |v,(x)| 
= 2%(| &,(x)|%* + | 
where by Theorem 37. 
| ,(x)| < max (m| y|, 1). 


Hence from Theorems 37 and 33 we get 
| f*(x) — dx| 
Mi 
M, M, 


I/n 1/( 
< dy f dy) 


0 I/n 


I/n 


0 
nf 
1 


/n 


< | (20+ 1) (19a) (20+ 10 + 1)-*) 


2e+1)0a—1,2 2e+1)0—22(1—D —1— = 
«< q s—(2e+1) s( + 1 2ea—2(1 | 


where 




where 
Gap = 2*+1G24(2s + 2)(2s + 
= — 2a — 2)(2sa — 2a — 3)-*. 


The exponent of g in the first term is 


> 2s — (2s + 1)0 — 2sA = (1—0—A)2s-—0>-—-0>-—2 (by (20)). 
The exponent of g in the second term is <2+2a—2sa< —2 since 2s >4%+2. 
Thus 
M, M, q=l 
na 


na 
(28+1)6—22(1—De 


< + f 
1 


(1 + f ) 
x (1 + (2s + 2 — (2s + 1)6 — 2s(1 — De,)A)—) 
+ 4+ (2sa + 2a + 4 — 2(1 — 
< (28 (1—De1) aA—2a) + | 
In the exponents of m we have 
2s(1 — De,)aA — 2a > (2s — 4)(1 — Des)aA — 1+ 2a = 2A + (1 — 
as in the proof of Theorem 36; and 
2sa — 1 — 2a S 2sa — 1 — 2A — (1 — a)***e, = 28a — 1 — X. 
This completes the proof. 
THEOREM 39. (G, Theorem 14.) On Mz 
< y|-4, 
where = (241) 21424 4G, 4, 


On we have 


x= | i< q < n°; 


< |y| < 1/(qn*), —-2/(q|y|) > 1. 


| | 




From the theory of Farey fractions* it is known that there exist integers }, 
and g; and a number y, such that 


b/atu=/gty, Sq 5 2/(Qly|), (649) =1; 


(33) | < 
It follows from (33) that | y,|<1/(2q.n'-*) and so we cannot have n*<q 


<n'~, for if this were the case x would be a point of a minor arc m. Also, 
1 <q, is impossible when n >2*, since otherwise 


| bg. — big) = — gil y| + < gaily] + 
= (¢ + < 2m*%q/(qn'-*) = < 2n-* < 1, 
and therefore bg, —b.g=0. Since (6, g) =1, (61, g:) =1, we have gi=q, 
which contradicts (33). Hence, 
Write 71(7) => 


(34) f(x) x/p1) > 71(j)(x/p)? + 71(n)(x/p1)". 


By Theorem 30 with g=q., m=[j*|]+1, t= —1, 
< +4 2% +4 
29-1 ly |-*; 
(35) | < (ert AG y|—A4 | 


Also, 
— x/pi| = — = 2sin 2r| < < 


Then from (34) and (35) we obtain 


7=0 
< +. 1) y| —A 
< | y| A. 
This proves the theorem when n = 2*. If n<2* we have 


* See the footnote at the beginning of this section. 




| f(x)| < ya—(1—aDe1) . 
< q~A | y|-4 


< y | 


THEOREM 40. (G, Theorem 15.) We have 
> f | f*(x) | 2| dx| < 
M2 


where Gss = 27*4G3;?*(2sA —1)-1(2sA —(2sA —1)0+3) —(2sA —1) 04+2)-1. 
By Theorem 39 


f | f*(x)|2| dx| < Ag~20A f 


Since the exponent of g is > —2 by (21), we have 


na 
< 22#4(2sA 2004 Dex 204+ (204-190 


q=1 
< 2%4(2s4 — 200A 
X (1 + ((2sA — 1)0 — 2sA + 2)—14(224-1)0e-2004+20) 
S 27*4(2sA — 1)-"(2sA — (2sA — 1)0 + 3)(2sA — (2sA — 1)0 + 2)-! 


As in Theorem 38 we have 2s(1—De,)aA —2a>X and the proof is complete. 
THEOREM 41. (G, Theorem 16.) We have 


J. 


2 
f(x) — v,*(x)| |dx| < Gn, 


where G1=2(Git+Gist+Gu+Ges+Gss). The summation extends over all p 
for which 1 <q <m*. There are terms in the sum. 




We may write 


M, 


M, 
the accent indicating that the term which corresponds to M, itself is omitted 
in the summation and written separately. Using the inequality 


N 2 
j=l 

we obtain 


J lec 
<2D f +22 f | 
+25 f +20 fo | Dy 
M2 M; 
M, M, 
+20 | 
52D ff + f 
Ms Ms M; 
M, M, M, M, 
=? 21d 2 


M; M, M, C-M, 


To these terms we apply Theorems 36, 40, 35, and 34, respectively. 




Then 


J. 


THEOREM 42. THE MAIN Lemma. (G, §9.) We have 


2 
f(x) — v,%(x)| |dx| < + Gis + Gas + Gos + Gas) 


>| o(j)| | 


j=1 


We note that 
¥,"(x) = + a)q-*S,* X the first + 1 terms of (1 — x/p)** 
+ a finite number of terms with higher powers of x/p 
+ higher powers 7 
+ higher powers; 7 
+ higher powers 
= + (F(sa + 5, 


+ higher powers. 
Also, 


f(x) =1+ + higher powers. 
j=1 
Hence 
f(x) — dov,*(x) = + xi + higher powers. 
j=1 
By Theorem 27, : 
1/2 n 
J | — "dy = >| o(j)| 2+ a positive quantity. 

jul 

Therefore 


1/2 
-1/2 


j=l 


by Theorem 41. 


| 
j=0 
n 




Let 
oo(j) = (T*(1 + a)/T(sa))(T(sa + 7)/j)SU, k, s, ©). 
Then 


n 


o0(j)| 2< 


where 
By the proof of Theorem 4 we have 

| oo(j) — o(j)| = + a)/T(sa))(T (sa + D/AMSG, k,s, ©) —G(j, k, s,n°)|, 
< + + — | AQ). 


Also, by Theorem 16, 


Re | A(q)| + f 


qg>ne 
Hence 
| < 2|o()|? 
+ + a)/T*(sa))(sa/(sa — (sa — 7200-2 


>| o0(j)| 2 << 2G\n%e-1-r 


j=l 


+ 2(F%(1 + a)/P%(sa))(sa/(sa — (sa — 


1 


< 2G\n**2-!— + 2(T7*(1 + a)/T*(sa)) 
X (sa/(sa — (sa — 1)(2sa/(2sa — 
< 
6. The third and fourth Hardy-Littlewood theorems. In this section we 


again follow paper L in the proof of the third and fourth Hardy-Littlewood 
theorems which are here Theorems 43 and 45, respectively. 


THEOREM 43. (L, Theorem 346.) Let H(ξ) denote the number of positive
integers j ≤ ξ for which the equation


(36) j = Σ_{i=1}^{s} h_i^k, h_i ≥ 0,


j=l 




is not solvable. Then 
< Cost, 
where Cos =3A4/C105, Cros = +a) 
By the Corollary to Theorem 42 we have for £22 


j=1 


In the summation )\¢/2<;<¢ |oo(7) |? there are H(¢)—H(/2) terms in which 
rz,.(j) =0. For these terms 


| |? =| + a)/T(sa)) (P(sa + f)/jS|? > + 
(by Theorem 25 and the Corollary to Theorem 29) 
= > = 
In the remaining terms | o9(j) |?=0. From (37) we get 


(H(E) — < > | 2 < 
&/2<jSE 


H(é) — H(E/2) < = Cost. 


This holds also when 0<£<2 since (36) is solvable for 7=1 and then 
H(&) —H(é/2) =0. Hence for §>0 and every integer v=0 


H(§/2°) — H(&/2°*") < Cos(E/2°)'* = 
< (—1+A< — 2/3), 


H(t) = > (A(E/2*) — H(E/2**)) < Cost! 2-2°/8 


v=0 v=0 
< 3Cest!* = Cost. 


THeEoreEM 44. (L, Theorem 348.) Let L,,(m) denote the number of positive 
integers 7 <n for which equation (36) with s=s2 is solvable. Then 


where By = 2-1-2(2¢— 1), Q-1-20/(f 1)A)). 
(i) Let ss=2. The number of solutions of the inequalities 


(38) Sn, h, 2 0, 2 0, 


is at least equal to the number of solutions of 


(39) 0 hy < (n/2)2, 0 < (n/2)2, hy +- he > 0. 




The number of solutions of (39) is 
([(/2)2] + 1)? — 1 > (m/2)24 — 1 > = 


> 2 = For each positive integerj <mthe equationj= hf +h? 
has at most (k—1)A.j*:<(k—1)A.m* solutions by the proof of Theorem 28. 


Therefore 
L.(n) > 1)A yn*1) = 


22-2¢2-2.C, (1-24) (1-4)? | 
111 


(ii) Let se>2 and assume that the theorem is true for s;—1, i.e., 
where Boo = Consider all integers 
(41) h*+<z 


such that 
his an integer >0, z is an integer, 


(42) n/2<h*<(h+1)*#<n, 


z is representable in the form h¥ , 


Since (h+1)*—h* > > 2h*-1 > = 2091-0 we have 


h*+z2<(h+1)*. This shows that to distinct pairs of values h, z of (42) cor- 
respond distinct integers (41). For suppose hf +2,=h# +z. Then 

ht < hf +2, = hf + 22 < + 1)+, 

< h# + 22 = ht + < + 1)* 
is impossible unless 4;=/2 and then 2z;=22. Moreover, each of the integers 
(41) is >Oand <(h+1)* <n. Therefore L,,(n) is at least equal to the number 


of pairs of values of h, z of (42). Since (n/2)*<h<mn*—1 by (42), the number 
of values / takes is 


— 1 — [(n/2)*] — 1 > — 2 — (m/2)* — 1 = 2-2(22 — 1)n* — 3 
= 2-!-¢(2¢ 1)n* = 


when n>(3-2!+2/(2*—1))*=cyo. The number of values z takes is 
L,,-1([n'-*]). Hence from (40) 


111 111 71 


= (1-20) (1—a)#s7-2— (1—a) 


TuHeorem 45. (L, Theorem 350.) For se as defined in §2 we have 
(2 — 2a)A > (1 — 2a)(1 — a)*-*. 


THEOREM 46. (L, Theorem 349.) For s and sz as defined in §2 and every 
integer n>C =max (Cros, C110, the equation 


s+s2 


(43) =n, hy = 0, 
t=1 


has at least one solution. That is, every integer >C is a sum of So kth powers 20 


when 
gilk, €1). 


Let m be an integer for which (43) is not solvable and write m=m +m. 


Then, since there are L,,(m) integers for which is 


solvable, there must be L,,(m) integers m:=n—m2<n for which 


(44) hy 2 0, 


i=1 


is not solvable. For if (44) were solvable for one of the L,,(m) integers m, then 
(43) would be solvable for contrary to our assumption. By Theorem 43 the 
number of positive integers <n for which (44) is not solvable is <Ceen'-. 


Hence 
L,,(n) < 


By Theorem 44, when »>max (Cio, C110) 
Therefore 
Bygn!— (1-24) Fa < 
(12a) (1-0) < Coe /Big, 
< Cg /Biy (Theorem 45), 
nA < Cee/ Big (A = 2A + (1 — 
It follows that (43) is always solvable when 


nm > max (C108, C110, 


7. The solution of (43) for integers <C. The following theorem is well 
known: 


THEOREM 47.* If every integer n for which f < n ≤ h is a sum of s − 1 kth
powers ≥ 0 and if m is the greatest integer such that

(45) (m + 1)^k − m^k < h − f,

then every integer n for which f < n ≤ h + (m + 1)^k is a sum of s kth powers ≥ 0.
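
The ascent step of Theorem 47 can be tested on small data. The Python sketch below is an illustration, not Dickson's argument: it computes by dynamic programming the least number of positive kth powers needed for each integer, determines m from (45), and reports whether the hypothesis and the conclusion of the theorem hold for given k, s, f, h. It assumes h − f ≥ 2 so that m ≥ 0 exists; all names are hypothetical.

```python
def min_powers(k, limit):
    """best[n] = least number of positive k-th powers with sum n (best[0] = 0)."""
    powers = []
    h = 1
    while h ** k <= limit:
        powers.append(h ** k)
        h += 1
    best = [0] + [limit] * limit
    for n in range(1, limit + 1):
        best[n] = 1 + min(best[n - p] for p in powers if p <= n)
    return best

def ascent(k, s, f, h):
    """Return (hypothesis, conclusion) of Theorem 47 for the data k, s, f, h."""
    m = 0
    # Largest m with (m + 1)^k - m^k < h - f, assuming h - f >= 2.
    while (m + 2) ** k - (m + 1) ** k < h - f:
        m += 1
    top = h + (m + 1) ** k
    best = min_powers(k, top)
    hypothesis = all(best[n] <= s - 1 for n in range(f + 1, h + 1))
    conclusion = all(best[n] <= s for n in range(f + 1, top + 1))
    return hypothesis, conclusion
```

For cubes with f = 454, h = 1000, s = 9 the routine finds m = 12 and extends the range to h + 13³ = 3197.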
THEOREM 48. For L=(k+1)*—k*>k* we have 


4\* 2\* 1\* k(2k+ 7) 
2 3 3 2 9 
Consider any integer n such that 0 < n ≤ 2^{k+1} − 2. If n ≤ 2^k − 1 it is
obviously the sum of 2^k − 1 kth powers, 0 or 1. If 2^k ≤ n ≤ 2^{k+1} − 2 we write
n = 2^k + x, 0 ≤ x ≤ 2^k − 2, and again n is a sum of 2^k − 1 kth powers since x is
a sum of 2^k − 2 kth powers, 0 or 1. Hence every integer in the interval


0 < n ≤ 2^{k+1} − 2 = h_1


is a sum of 2^k − 1 = m_1 kth powers ≥ 0. Since 2^k − 1^k < h_1 < 3^k − 2^k, it follows
from Theorem 47 with m = 1 that every integer in the interval 0 < n ≤ h_1 + 2^k
is a sum of m_1 + 1 kth powers ≥ 0. We repeat this step m_2 times so that every
integer in the interval 0 < n ≤ h_1 + m_22^k is a sum of m_1 + m_2 kth powers ≥ 0,
where
(46) h_1 + (m_2 − 1)2^k ≤ 3^k − 2^k < h_1 + m_22^k.


We now apply Theorem 47 m_3 times with m = 2 and conclude that every
integer in the interval 0 < n ≤ h_1 + m_22^k + m_33^k is a sum of m_1 + m_2 + m_3 kth
powers ≥ 0, where


(47) h_1 + m_22^k + (m_3 − 1)3^k ≤ 4^k − 3^k < h_1 + m_22^k + m_33^k.
In general every integer nm such that 
t 
mij* 


j=2 


is a sum of Diam; kth powers 20, where 


t-1 t 
j=? jm? 
From (47) and (46) we get 
= 4* — hy < 4* — 3% + 25, 
and in general from (48) when #23, 


* L. E. Dickson, loc. cit. 




t—1 
< (t+1)*— Do mgt — ky < (¢+1)* 1)*, 


jm? 
(49) 


Hence 


<mtm+ 14+ (1-7) 


j=l j=3 


k\ 1 k\ 1 
+( 


4/j7* 


sm +m—¢-2+(2) +(=) 
3 3 
k\ 1 
< mi +m:— (6-2) +(=) +(=) 
k\ 1 k\ 1 
4\t 
+(5) 
k\ 2 k\ 1 k\ 1 
=m +(=) 
3 3 
k 


+(2) 
9 3 3] 


From (46), m22*<3*—h,=3'—2*+14+2, and hence when t=k we get 
L=(k+1)*—k* and 


3 k 1 k 
= <2-1+4(=) -2+2(—) 


jun 


4\* 2\* 2k(k — 1) 


3 9 
3\* 4\* 
+) +26) 


2\* 1\* k(2k+7) 
(=) +2(5) 



t t 




THEOREM 49. If every positive integer ≤ L is a sum of s − 1 kth powers ≥ 0,
then every positive integer ≤ (L/k)^{k/(k−1)} is a sum of s kth powers ≥ 0.


Since [(L/k)^{1/(k−1)}]^k − ([(L/k)^{1/(k−1)}] − 1)^k < k[(L/k)^{1/(k−1)}]^{k−1} ≤ k(L/k) = L, we may apply
Theorem 47 with m + 1 = [(L/k)^{1/(k−1)}]. Thus every positive integer
≤ L + [(L/k)^{1/(k−1)}]^k is a sum of s kth powers ≥ 0, and
L + [(L/k)^{1/(k−1)}]^k ≥ L + ((L/k)^{1/(k−1)} − 1)^k > (L/k)^{k/(k−1)}.


THEOREM 50. If every positive integer ≤ L, where L > k^k, is a sum of s_3 kth
powers ≥ 0, then every positive integer ≤ C is a sum of s_3 + s_4 kth powers ≥ 0,
where


s_4 ≥ (log log C − log (log L − k log k))/(log k − log (k − 1)).


That is, every integer ≤ C is a sum of s_0 kth powers ≥ 0 when


s_0 ≥ s_3 + s_4 = g_2(k,


By Theorem 49 every positive integer ≤ (L/k)^{k/(k−1)} is a sum of s_3 + 1 kth
powers ≥ 0. Write L_1 = (L/k)^{k/(k−1)} and apply Theorem 49 again. Thus every
positive integer ≤ (L_1/k)^{k/(k−1)} = L_2 is a sum of s_3 + 2 kth powers ≥ 0. Also,


L_2 = (L_1/k)^{k/(k−1)} = L^{(k/(k−1))²}/k^{k/(k−1)+(k/(k−1))²},


log L_2 = (k/(k − 1))² log L − (k/(k − 1) + (k/(k − 1))²) log k.


In general, every positive integer ≤ L_{s_4} is a sum of s_3 + s_4 kth powers ≥ 0,
where


log L_{s_4} = (k/(k − 1))^{s_4} log L − (k/(k − 1) + ··· + (k/(k − 1))^{s_4}) log k
= (k/(k − 1))^{s_4} log L − k((k/(k − 1))^{s_4} − 1) log k
> (k/(k − 1))^{s_4}(log L − k log k).
This expression is ≥ log C when
s_4 ≥ (log log C − log (log L − k log k))/(log k − log (k − 1)).
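
The repeated use of Theorem 49 amounts to iterating the map L → (L/k)^{k/(k−1)}. The sketch below, with hypothetical function names, carries out this iteration on log L, since the numbers themselves are far too large to store. It counts the steps needed to reach log C and compares the count with the bound on s_4 obtained from the last inequality of the proof.

```python
import math

def ascent_steps(k, log_L, log_C):
    """Number of applications of L -> (L/k)^(k/(k-1)) needed before log L >= log C."""
    if log_L <= k * math.log(k):
        raise ValueError("need L > k**k so that every step increases L")
    steps = 0
    while log_L < log_C:
        log_L = (k / (k - 1)) * (log_L - math.log(k))
        steps += 1
    return steps

def s4_bound(k, log_L, log_C):
    """Least integer s_4 with s_4 >= (log log C - log(log L - k log k)) / (log k - log(k-1))."""
    ratio = (math.log(log_C) - math.log(log_L - k * math.log(k))) / (
        math.log(k) - math.log(k - 1))
    return max(0, math.ceil(ratio))
```

Because the proof discards the positive term k log k, the exact iteration can never need more steps than `s4_bound` allows.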


8. Evaluation of the constants. We first prove three lemmas. 


LEMMA 1. For w ≥ 5 we have


Σ_{j=1}^{n} (1 + j^{−1})^w < 2^{w+1} + n − 1 + (w + 1) log n.




Let [w]+1. Then 


Sates 
j=l 1 1 
t 
<2 ) 


+ 

= 2°+n— og + — 

2 1/ 2 


1 (‘)- 
2 1/ 2 2/2 4/ 6 
1 
4 


when t= [w|+126, we have 


2° + nm —1 + ([w] + 1) log + 2! 
+n—1+ (wt 1) logn. 


LEMMA 2. For x > 0 we have


(2^x − 1)^{−1} ≤ (x log 2)^{−1}.
Consider the function
y = x(2^x − 1)^{−1}, y′ = (2^x − 1 − x2^x log 2)(2^x − 1)^{−2}.


We have y′ < 0 when x > 0. Hence y attains its maximum value when
x = 0. That is, max y = 1/log 2 and the desired result follows.


1 
t\ 1 
t\ 1 t\ 1 1 
| 
Since 
2 ty 2 af a 4/ 6 5/ 4 


LEMMA 3. Let t be an integer ≥ 1. Then
log (t!) ≤ (t + 1) log t − t + 1.
We have


log (t!) = Σ_{n=1}^{t−1} log n + log t ≤ ∫_1^t log u du + log t = (t + 1) log t − t + 1.
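
Lemmas 2 and 3 are simple enough to spot-check numerically. The sketch below evaluates the slack in each inequality, using the standard-library `math.lgamma` for log (t!). The function names are chosen for illustration, and Lemma 3 is taken in the form log (t!) ≤ (t + 1) log t − t + 1, which is the form used later for log (k!).

```python
import math

def lemma2_gap(x):
    """(x log 2)^(-1) - (2^x - 1)^(-1); Lemma 2 asserts this is >= 0 for x > 0."""
    return 1.0 / (x * math.log(2)) - 1.0 / (2.0 ** x - 1.0)

def lemma3_gap(t):
    """(t + 1) log t - t + 1 - log(t!); Lemma 3 asserts this is >= 0 for t >= 1."""
    return (t + 1) * math.log(t) - t + 1 - math.lgamma(t + 1)
```

The gap in Lemma 3 is zero at t = 1 and grows slowly with t, so the lemma loses little when it is applied to log (k!).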


The constants are now evaluated as follows. 

(ae) a, <7 log 2 (Theorem 2). 
(y(8)) = 48(28 + + 1) (Theorem 29). 
(y2(8)) v2(8) = + 1) (Theorem 29). 
(ya) vs = (1 + 12e)-! (Theorem 29). 
(cis) log cis = — 1) log k + — /(2k) (Theorem 15). 
(b18) big = (1 + (Theorem 24). 
= 2, 


> { (Theorem 10). 


Proof: By the proof of Theorem 10, r<4k—1 when p=2 and rS2k—1 
when p>2. Also, y Sk, since for p>2 


y¥=O0+15 20S 
for p=2,0>1, 
and for p=2,@0<1, 
Hence 
b(p) = Pr = p-" > { 2. 
(b4) log (1/bs) < (Theorem 25). 


Proof: From Theorem 25 


II a- pp”). 


PS bis p> bis 


PS bis p> bis 


Let 




Then 
Il: < Il | 


bis 
log S &(4k — 1) log 2 + &(2k — — &(2k — 1) log 2 
6 
< 2k log 2+ k(2k — »(= chis + 3 log* bis + 8 log bis + 5) 
‘ (Theorem 3) 
= 2k log 2 + k(2k — 1) (2 c(1 + 
+ 3(2/(s — 5))* log? (1 + &*) + 8(2/(s — 5)) log (1 + &*) + 5) 


6 
S 2k? log 2 + k(2k — 1) (< 622/65 228/18 +. 3(2/65)* log? (2k7°) 


+ 8(2/65) log (2k7°) + s) (since s = 70 by (2)) 

< 2k* — log 3 (k 2 6). 

Also, 
Il, < 1/( = f = 3, 
Pp n=l 1 

Therefore 

log (1/b4) = log II, + log [T, < 2k* — log 3 + log 3 = 285. 
(A) log A, < (Theorem 1). 


Proof: We have 
log A, = 2(2%) log 2 + ((3/2)™) log (3/2) 
+--+ — + 9((3/2)") +---). 
Since r((1+j-')") =0 and #((1+7-")") =0 when (1+ 7-')" <2, that is, when 
j>1/(2"—1), we may write 


log A; = > m((1 + j-')™) log (1+ 


j=l i=l 


where m= [1/(2"—1)]+1. By Theorems 2 and 3 




log A, < 


j=l 


$2 


j=l j=l 
— 13 (1+ 77!) — 15(n — 1») 
j=l 


< (a2 — che(2%t! + m — 1 + (mi + 1) log 2) 


12 
+ + m — 1 + + 1) log n) 


3 
— 1) log? 2 + 13, log + 1) + — 1) (Lemma 1) 


< (a2 — c)(e2™* + — + (1 + log ((28 — + 1)) 


12 
(24/9 — + (1 + ax) log — + 1)) 


+ (2 — + 13e, log — + 2) + 15¢,(2" — 1)-1 
S (aq — c)(e:2%*! + 1/log 2 + (1 + €1) log ((e: log 2)-! + 1)) 
+ < (€,2(ut2)/2 + 1/log 2 + (1 + €1) log ((e1 log 2)-! + 1)) 
+ 17e,(€, log 2)—! + 13e, log log 2)-! + 1) (Lemma 2) 
< (a2 — ¢)(e:2%*1 + 1/log 2 + (1 + €1) log 2m:) 
+ (€,2(mt2)/2 + 1/log 2 + (1 + €1) log 21) 
+ 17/log 2 + 13; log 3m. 
Since a2—c <4 this expression is <9¢2" when m212. By (12) 
m 2 17> 12 when k = 6, 
m 2 D+ 2**> 20 > 12 when & > 6, 
and hence log Ai: <9«,2". 
(Ae) log Az = (k — 2) log A; (Theorem 5). 
(A3) log As < log m: + log 10k (Theorem 7). 




Proof: 
log As = log (1/(ese*-*)) < log (1/es) = log (10m) 


since 10ke; =€. 


(Cis) log Cis = (k — 2) log Ai + K log 4 (Theorem 6). 
G;) log G; = log Ai + log (4k — 4) (Theorem 28). 
($G_4$) $\log G_4 < (k-2)\log A_1 + \log \eta_1 + K \log 4 + (k+2)\log k - k + 1 + \log 80$ (Theorem 30).
Proof:

$$\log G_4 = \log (8\,k!\,C_{15}A_3) = \log 8 + \log (k!) + \log C_{15} + \log A_3$$
$$< \log 8 + (k+1)\log k - k + 1 + (k-2)\log A_1 + K\log 4 + \log \eta_1 + \log 10k \quad \text{(Lemma 3)}.$$

Gs) Gs < 3c15 (Theorem 31). 
Proof: 
Gs = + a) + 2ay(a)(1 — + (2a + 1)(a + 1)-") < 

(G;) G7 < C15 (Theorem 32). 
Gs) Gs < ¢15 (Theorem 32). 
G,) Gy = max G;,G; + Gs) < 3cis (Theorem 33). 
Gi) Gu < (Theorem 34). 


Proof: We have 
2sa — 1 — (2sa — 1) _ 2sa — 1+ 2K? 


2sa — 2 — (2sa— 1)0 —1 (by (19)) 
(k — 2)K-+ k + 2kK? 


Hence 
Gio = 27**(2sa — — 1 — (2sa — 1)6)(2sa — 2 — (2sa — 
< 
Gis) Gis < 15G,4 (Theorem 35). 


Proof: 


Gi3 = (27 + 1)2'4GA S (27 + A < 15G,4. 
Gis) log Gis < ((2s — 4)(k — 3/2)A +1) log A; (Theorem 36). 




Proof: 
logGis = (2s — 4) log Gis + log Gs + (2a + €;) log 2 
= (2s — 4)A logG, + (2s — 4 )log 15 + logG; + (2a + €1) log 2 
< ((2s — 4)(k — 2)A + 1) log A; + (2s — 4)A(log 71 + K log 4 
+ (k + 2) log 2 — k + 1 + log 80) 


+ log (4k — 4) + (2s — 4) log 15 + (2a + e:) log 2 
2logy. 2K log4 
log Ai log A, 


= ((2s — 4)(k — 2)A + 1) log Ai + (s — 2)A log As( 


2(k+2)logk 2k—2 2 log 80 log (4k — 4) 
log A; ‘log A, (s — 2)A log Ay 

2log 15 (2a + log 2 

A log A, 


Each of the positive terms in the coefficient of (s—2)A log A: is <1/7 when 
m22k and thus 


log Gis < ((2s — 4)(k — 2)A + 1) log Ai + (s — 2)A log Ai 
since 7. >17>12=2k when k=6 and 7, >2*-*>2k when k27. 
Gis) Gis < 25GA (Theorem 37). 
Proof: 
Gig = (24 + 1)(2(3G4)4 + y(a)) S (24 + + (1/6) 
S (2m + + y(1/6))G4 < 25G4. 


(Ge) log Gag < 2s(k — 3/2)A log Ai (Theorem 38). 
Proof: We have 


2s +2 28(11-A—0) 
2s +1 2s +1 +2-—0 
2+23-0 


(by (20) and (2)). 


Hence 


Gog = (2s + 2)(2s + 3 — (25 + — 2s(1 — Der)A) < 
(2s + 1)(2s + 2 — (2s + 1)@ — 2s(1 — De,)A) 


+ 


logGe4 < 2s logGis + (2s + 2) log 2 < 2sA logG, + 2s log 25 + (2s + 2) log 2 
= 2s(k — 2)A log Ai + 2sA (log 91 + K log 4 + (k + 2) log k 
— k+1 + log 80) + (2s + 2) log 2 + 2s log 25 


2logym. 2Klog4 2(k+2)logk 
log Ay log Ai log Ai 


= 2s(k — 2)A log Ai +sA log 


2k—2 21log 80 (2s + 2) log2 
log A; log Ai SA log A, A log A, i 


As before the coefficient of sA log A: is <1 and thus 


log Gag < 2s(k — 2)A log Ai + SA log Ai. 
(Ges) log Gas < 2s(k — 3/2)A log Ai | (Theorem 38). 
Proof: 


(2sa—2—2a)(2sa—3) ((k — 2)K — 2)((k — 2)K — k) 
(2sa —3—2a)(2sa— 4) ((R—2)K — k — 2)((k — 2)K— 2k) 


Also 
log Gy < log cis + log 3 = (k — 1) log k + an(k — 2)k?*/(*-2)/(2k) + log 3 
< (k — 1) log k + 7(log 2)k*/3 + log 3 


< 2 (by (2)). 


< (k — 2)A log Ai (m 2 2k) 
<A log G, < log Gis. 
Therefore 
log Gos < (2s + 2) log 2 + (2s — 2) log Gy + 2 log Gis 
< (2s + 2) log 2 + 2s log Gis 
< 2s(k — 3/2)A log Ay 
as in (Gy). 
Gs:) Gai < 25G4 (Theorem 39). 
Proof: 
Gar = (2m A < (Qe + A < 2564. 


Gss) log < — 3/2)A log Ay (Theorem 40). 


(2s — K)@ — (2s — 3K) _ 2s — K + 2K? 
(2s — K)@ — (2s — 2K) 2s —K 

2)K + 2k—-K-+2K* 3)K + 2K* 3+ 2K 
(k—2)K+2k—K (k — 3)K 3 


(by (21)) 


< 2K? = 22-1, 


Hence 
Ga < 
log G35 < 2s log G3; + (2sA + 2k) log 2 
< 2sA log Gy + 2s log Gos + (2sA + 2k) log 2 
< 2s(k — 3/2)A log Ai, 
as in (Gx). 
(G;) log G; < ((2s — 4)(k — 3/2)A + 1) log Ai + log 10 (Theorem 41). 
Proof: Consider first Gip. We have 
log Gio < 2s log Gy + (2sa + k — 1) log 2 + log 3 
< 2sA log G, + (2sa + k — 1) log 2 + log 3 
by the proof for (G25), and then as before 
log Gio < 2s(k — 3/2)A log Ai. 
Also, 
(2s — 4)(k — 3/2)A +1 = 2s(k — 3/2)A +1 — 4A(k — 3/2) > 2s(k — 3/2)A. 
Hence 
Gy = 2G + Gis + Gog + Gos + Gis) 
S 10 max Gio, Gis, Gos, Ges, Gas), 

log Gi < ((2s — 4)(k — 3/2)A + 1) log Ai + log 10. 
(A,4) log A, < ((2s — 4)(k — 3/2)A + 1) log A1 + log 22 (Theorem 41). 

Proof: 

A, = 2G, + + a)/T(sa))(sa/(sa — 1))9(2sa/(2sa — 
< 2G, + < 2G, + (by (Gss)) 
< 2G: + G33) < 2G: + Ga) S 2(10 max Gio, --- , Gas) + Gas) 
< 22 max --- , Gss), 

log A, < ((2s — 4)(k — 3/2)A + 1) log Ai + log 22. 


Proof: 




(C104) Ciog = + (Theorem 43). 
(C105) C105 = (Theorem 43). 
(Ces) log Ces = log Ag + log (1/c105) (Theorem 43). 
(Coe) log Cos < ((2s — 4)(k — 1)A + 1) log Ay (Theorem 43). 
Proof: 
log Ces = log Ces + log 3 = log Ag + log (1/ci05) + log 3 
< ((2s — 4)(k — 3/2)A + 1) log Ai + (2sa — 2) log 2 + 2 log I'(sa) 
— 2s log T(1 + a) + 2 log (1/b,) + 2 log (1/73) + log 66 

(2sa — 2) log 2 
(s — 2)A log Ay 


< ((2s — 4)(k — 3/2)A + 1) log Ai + (s — 2)A log As( 


2 log I'(sa) 2s log 2 

(s— 2)A log A, (s — 2)A log Ai 

2 log (1 + 12e) log 66 

(s—2)AlogA, (s—2)A 
As before each of the terms of the coefficient of (s—2)A log A: is <1/6 when 
m >2k+5. Since 

m>17=2k+5 when k=6, m>33>19=2k+5 when k =7, 
m > 2*-%>2k+5 when & 2 8, 


we have 
log Cos < ((2s — 4)(k — 3/2)A + 1) log Ai + (s — 2)A log Ai. 
(C109) C109 = 2722-1 (Theorem 44). 
(C110) Ciro = 3*-2*+1(2¢ — 1)-* (Theorem 44). 
(¢111) C11 = — 1) (Theorem 44). 
(Cr) log (1/C11) = (k — 1) log Ai + log (1/c109) (Theorem 44). 
(Big) log (1/Bis) < (2s — 4)A log 11:+(k—1)log Ai (Theorem 44). 
Proof: 
log (1/Big) = (s — 2) log (2/¢111) + log (1/C7) 
= (k — 1) log A; + (s — 2) log (27+#/(2* — 1)) + (2a + 1) log 2 
(s — 2) log (2?+9/(2* — 1)) 
(2s — 4)A log Ay 


= (k — 1) log Ai + (2s — 4)A log As( 


(2a + 1) log 2 ) 
(2s — 4)A log 




As before the coefficient of (2s—4)A log A; is <1 and thus 
log (1/Biy) < (2s — 4)A log Ay + (k — 1) log Ai. 
(C) log C < 20k32" (Theorem 46). 


Proof: 
log C = k2*-*(log Ces + log (1/Bi9)) 


< k2*-2((2s — 4)kA + k) log Ay 

< 9k2*-2((2s — 4)kA + (by (A,)) 

< 20k%2" (by (2) and (17)). 
9. Proof of the main theorem. We prove the following 


THEOREM. We have 


1 
g(k) 


1/2 
+(@+eD+0- 2) + 4F(ED + R)) )|+ 1, 


By Theorem 46 every integer =>C is a sum of so kth powers when 
So2S+ 
So = (H + (1 + (1 — — +2+44+% 
= (Am + (1 + (1 — + — D)(6 + $%))(m — 

(50) so= ((H +Q)m + R)(m — 
Also, by Theorem 50 every integer <C is a sum of so kth powers if 

So 2 53 + (log log C — log (log L — k log k))(log k — log (k — 1))“! 

$3 + (3 log k + log 20 + m log 2 — log (logZ — k log k)) 
X (log & — log (k — 1))"? (by (C)), 

(51) so=> E. 
The right members of (50) and (51) are equal when 


(52) 
+ 4F(ED + R)) (2F)- 


— g(k) 1 
— > 
= k=o k-2*-1 2 




and then every integer is a sum of sp kth powers 20 when 


1/2 


It remains to show that this choice of 7; satisfies the condition (17). Since 
E>2* we have ED+R>0 and thus 


m > (H +FD +0 — 
(53) m—-D>(H+0Q- 
Also, 
H+Q—E= (k— 2)2°*+k+6+ — 5:3 
— (3 log k + log 20 — log (log L — log k)) (log k — log (k — 
> (k — — 2* 


— (3 log k + log 20 — log (log L — k log k))(log k — log (Rk — 1))“' 
= (k—6)2**+k+6+h: 


-9) 


— (3 log k + log 20 — log (log L — k log k))(log k — log (k—1))=?; 

and 

= log 2(log k — log (k — 1))7?. 
Hence (H+Q—£)F-'!—2*- is an increasing function of k which is positive 
when k=7 so that 

for all k=7. Then from (53) 

m—- D223 

for all k=>7. When k =6 direct substitution in (52) yields 7, >17. 


To obtain the values of g(k) which are given in the introduction we re- 
quire the following: 




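Every integer from              to                              is a sum of

$11\cdot 2^{6}$                 $12\cdot 2^{6}$                 39 6th powers
$25\cdot 2^{7}+6\cdot 3^{7}$    $26\cdot 2^{7}+6\cdot 3^{7}$    58 7th powers
$25\cdot 2^{8}+9\cdot 3^{8}$    $26\cdot 2^{8}+9\cdot 3^{8}$    120 8th powers
$38\cdot 2^{9}$                 $39\cdot 2^{9}$                 285 9th powers
$57\cdot 2^{10}$                $58\cdot 2^{10}$                737 10th powers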


By repeated application of Theorem 47 as indicated in the proof of Theorem 
48 we obtain the following values for L and ss. 


Every positive integer < is a sum of 


alin 73 6th powers 


143 7th powers 


gg 279 8th powers 


10°” 548 9th powers 


10%" 1079 10th powers 


CALIFORNIA INSTITUTE OF TECHNOLOGY 
PASADENA, CALIF. 




ALMOST PERIODIC FUNCTIONS IN A GROUP. I* 


BY 
J. v. NEUMANN 


INTRODUCTION 


1. The object of the present paper is to extend H. Bohr’s famous theory
of almost periodic functions [4, I]† to arbitrary groups, and to show that it
gives just the maximum range over which the fundamental results of the
Frobenius-Schur representation theory [21; 22; 30] and its extensions by Peter
and Weyl [32] hold. We shall see in particular that all bounded linear
representations of a group are equivalent to unitary representations and belong to
this class. Another point of importance is that we free ourselves completely
from all topological assumptions (such as continuity, etc.) by the use of a
definition of almost periodicity due to Bochner [2]. Thus we find that the general
theory, which applies to every group $\mathfrak{G}$ whatsoever, is completely free from
topological assumptions, but all of its results (for example, all series
expansions) have a property of closure; if applied to functions which are continuous
in a certain topology, they will lead only to functions of the same kind. It is
remarkable that we find in the classical case of Bohr new almost periodic
functions in addition to the known ones; even the elementary functions
$f(x) = e^{i\lambda x}$ can be generalized (this connects with results of Ursell [28]). On
the other hand, in some groups (for example in all semi-simple Lie groups)
almost periodicity automatically implies continuity (this will be proved with
the aid of a theorem of van der Waerden [29]).

2. The principal difficulty in building up a general theory of almost
periodic functions lies in finding a generalization of the Bohr integral mean

$$\lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} f(x)\,dx$$

if the real numbers $x$ and $T$ are replaced by the elements of an arbitrary group
$\mathfrak{G}$ which need not be even topological; also, the function $f(x)$ may be
discontinuous. We meet this difficulty by finding an entirely new definition (cf.
Definitions 4 and 5) which may be proved to be fit for the role of a “mean”
under all conditions. The direct discussion of our mean is very simple and is
given in Part I.


* Presented to the Society, March 31, 1934; received by the editors December 18, 1933. 
† The numbers in brackets refer to the bibliography at the end of the paper.




446 J. v. NEUMANN [July 


This mean is an extension of an integral in compact groups previously 
defined by the author [19]. It is defined by means entirely different from 
those employed in Haar’s integral [11] with which it coincides for compact 
groups, but from which it differs widely for non-compact groups, first, be- 
cause for such groups it is an integral-mean and not an integral; second, be- 
cause it is free from topological restrictions, while Haar’s integral applies only 
to locally compact and separable groups; third, because it is defined for al- 
most periodic functions while Haar’s integral is defined for measurable func- 
tions, and in general neither of these two classes contains the other. 

3. The content of Parts I-V is as follows: Part I gives our general theory 
of the mean. Part II applies this theory (by using the powerful method of 
Weyl [31]) to prove the fundamental theorems of the Bohr theory, Parseval’s
formula and the approximation theorem. As we have to combine the devices 
contained in two papers of Weyl [30; 31] we find it advisable to give the 
proofs in full, even though the repetition is often almost literal. Part III re- 
peats the main results of the Frobenius-Schur and Peter-Weyl theory of rep- 
resentations, and connects them with the theory of almost periodic functions. 
It provides a basis for the statement that the present general theory of almost 
periodic functions is the widest range over which this theory of representa- 
tions holds without any loss of strength. Part IV connects our theory with 
topological and other restrictive conditions. By investigating the details of 
eight examples we illustrate the principal types of combinations of these no- 
tions which are likely to occur. Finally, we discuss the question as to how 
many almost periodic functions exist in a given group. Part V is entirely de- 
voted to the proof that the maximal amount exists in Abelian groups (subject, 
however, to certain topological restrictions). Here the integral of Haar is used 
in combination with certain theorems of the author on operators and func- 
tions of operators [17]. The extension of some results of Haar on countably 
infinite Abelian groups [10] is of great importance for these investigations. 

4. It is probable that most of the further developments of the Bohr theory
will also apply to our general theory. Among these developments are finer
convergence theorems, summability theorems, and Stepanoff’s generalizations
(where some topological restrictions will be necessary, as the Haar integral
must be applied). In this connection it may be of interest to point out
a needed generalization of an important notion of the Bohr theory, namely,
the fact that the product of two elementary almost periodic functions is a
function of the same kind: $e^{i\lambda x}e^{i\mu x} = e^{i(\lambda+\mu)x}$. This is unchanged for
Abelian groups and leads to the important character-group; but in
non-Abelian groups the corresponding situation is that the direct product of two
irreducible representations (the elements $D_{\mu\nu}(a; \mathfrak{E})$ of which are the analogues
of $e^{i\lambda x}$, cf. Definitions 11 and 12, and Theorems 24 and 28) is a sum of a
finite number of irreducible representations, that is, there is the so-called
composition formula


(*) D,«(a; ©)D,.(a;D) = & tT, v; D)Deq(a; €). 


Another important notion in Bohr’s theory is the independence of the
expansion functions $e^{i\lambda_1 x}, e^{i\lambda_2 x}, \cdots$ (that is, the linear independence of their
exponents with integral coefficients), since almost periodic functions with
such expansions possess particularly simple convergence properties. The
corresponding requirement in our general theory is probably that the
right-hand member of (*) should contain no term originating from the
representation $D(a; \mathfrak{E}) \equiv 1$ if the left-hand member is any product of powers of
$D(a; \mathfrak{E}), D(a; \mathfrak{D}), \cdots$.


I. EXISTENCE OF THE MEAN, GENERAL PROPERTIES 


5. Let $\mathfrak{G}$ be a group, that is, a set in which the operations $ab$ and $a^{-1}$
are defined and satisfy the group postulates. While $\mathfrak{G}$ may be topological*
this property is not needed in Parts I-III and we do not yet make this
assumption concerning $\mathfrak{G}$. Elements of $\mathfrak{G}$ will be denoted by $a, b, c, x, y, z, \cdots$,
real or complex numbers by $m, n, \mu, \nu, \alpha, \beta, \epsilon, \eta, \cdots$, and functions defined
in $\mathfrak{G}$ with complex numbers as values by $f(x), g(x), \cdots$.

For such functions $f(x)$ and $g(x)$ we define distance† by

$$D(f, g) = \operatorname{l.u.b.}_{x} |f(x) - g(x)|.$$

A set $\mathfrak{M}$ of such functions is called conditionally compact (c.c.) if every
sequence $f_1, f_2, \cdots$ extracted from it contains a subsequence $f_{n_1}, f_{n_2}, \cdots$ such
that $D(f_{n_\mu}, f_{n_\nu}) \to 0$ as $\mu, \nu \to \infty$ (that is, a “fundamental” subsequence [13,
p. 107]); this means that there exists a function $f$ (not necessarily belonging
to $\mathfrak{M}$) such that $D(f_{n_\mu}, f) \to 0$ as $\mu \to \infty$.

We now extend Bohr’s notion of almost periodic functions [4; 2, §5] to 
all f(x) in G, but we prefer to generalize the definition given by S. Bochner 
[2], as it allows us to rid ourselves completely of topological conditions on 
f(x) (continuity, etc.). 


* That is, a topological set in the sense of Hausdorff [13, pp. 226-230]. One may take his
topological system based on the notion of a neighborhood by means of Axioms 1, 2, 3 (or A, B, C) and
one of the “separation” Axioms 4-8, such as 5. Furthermore, certain continuity assumptions have to
be made concerning $ab$ and $a^{-1}$. In Parts I-III we shall need no topology at all, in Part IV we must
assume that $ab$ is continuous in $a$ for fixed $b$ and in $b$ for fixed $a$, and in Part V we must assume that
$ab$ is continuous in $(a, b)$ and that $a^{-1}$ is continuous in $a$.

† We shall consider only bounded functions. $\operatorname{l.u.b.}_{x}$ denotes the least upper bound for all $x$’s in $\mathfrak{G}$.




DEFINITION 1. A function $f(x)$ in $\mathfrak{G}$ (with complex values) is called right
almost periodic (r.a.p.) if the set $R_f$ of all functions $f(xa)$ ($x$ is the variable, $a$
is a parameter running over $\mathfrak{G}$) is c.c.; it is called left almost periodic (l.a.p.) if
the set $L_f$ of all functions $f(ax)$ is c.c.; it is called almost periodic (a.p.) if it is
r.a.p. and l.a.p.

The equivalence of this definition to the obvious generalization of the 
Bohr definition is shown in the usual way if f(x) is continuous; similarly, the 
uniform continuity of f(x) follows in this case. But as we do not wish now to 
assume any topology in G, we shall not go into the details of this matter. On 
the other hand, the following theorems are of major importance: 


THEOREM 1. Each of the three notions r.a.p., l.a.p. and a.p. is invariant
under the following operations: $f(xa)$, $f(ax)$, $\overline{f(x)}$, $\alpha f(x)$ ($\alpha$ any complex number),
$f(x)+g(x)$, $f(x)g(x)$, and the operation of passing from $f_1(x), f_2(x), \cdots$ to $f(x)$ if
$f_n(x)$ converges uniformly to $f(x)$ as $n \to \infty$. Passing from $f(x)$ to $f(x^{-1})$
interchanges r.a.p. and l.a.p. and leaves a.p. invariant.

The statement concerning $f(x^{-1})$ is obvious. In the other cases we need
to consider only r.a.p., as l.a.p. results, for example, by replacing $ab$ by $ba$
when defining $\mathfrak{G}$, and a.p. results by combining r.a.p. and l.a.p. That $f(xa)$
is r.a.p. is seen by replacing $a_1, a_2, \cdots$ (Definition 1) by $a_1a, a_2a, \cdots$; that
$f(ax)$ is r.a.p. results from replacing $x$ by $ax$; the situation concerning $\overline{f(x)}$ and
$\alpha f(x)$ is obvious; the r.a.p. of $f(x)+g(x)$ and of $f(x)g(x)$ is proved by applying
Definition 1 first to $f(x)$ and $a_1, a_2, \cdots$, and then to $g(x)$ and the subsequence
which has been selected. An obvious and simple application of the diagonal
process shows the invariance of r.a.p. under the operation of passing from
$f_n(x)$ to $f(x)$.


THEOREM 2. Every r.a.p. or l.a.p. function f(x) is bounded. 


Again it is sufficient to consider r.a.p. If $f(x)$ were not bounded, we could
select a sequence $a_1, a_2, \cdots$ such that $|f(a_n)| \to \infty$ as $n \to \infty$, and then no
subsequence of $f(xa_1), f(xa_2), \cdots$ could have a finite limit at $x = 1$.


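DEFINITION 2. If $\mathfrak{M}$ is a set of functions in $\mathfrak{G}$, we call the set of all functions
$\alpha_1 f_1(x) + \cdots + \alpha_n f_n(x)$ ($n = 1, 2, \cdots$; $\alpha_1, \cdots, \alpha_n$ non-negative real numbers
such that $\alpha_1 + \cdots + \alpha_n = 1$; $f_1, \cdots, f_n$ any elements of $\mathfrak{M}$) the convex of $\mathfrak{M}$
and denote it by $\mathrm{Co}(\mathfrak{M})$.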


6. We prove 
THEOREM 3. If either of the sets $\mathfrak{M}$ and $\mathrm{Co}(\mathfrak{M})$ is c.c., the other is also c.c.


If $\mathrm{Co}(\mathfrak{M})$ is c.c., its subset $\mathfrak{M}$ is c.c. Conversely, suppose that $\mathfrak{M}$ is c.c.
The c.c. property of a set $\mathfrak{N}$ is equivalent to the following condition: for every
$\epsilon > 0$ there exists a finite number of functions $f_1, \cdots, f_m$ of $\mathfrak{N}$ ($m = m(\epsilon)$)
such that, for each $f \in \mathfrak{N}$, some $D(f, f_\mu) \le \epsilon$, $\mu = 1, \cdots, m$ [13, pp. 108-109].
Now if an $\epsilon > 0$ is given, choose the functions $f_1, \cdots, f_m$ for $\mathfrak{M}$ and $\epsilon$, put

$$\max_{\mu = 1, \cdots, m} \operatorname{l.u.b.}_{x} |f_\mu(x)| = C,$$

and select an integer $N > Cm\epsilon^{-1}$. Then the functions $\beta_1 f_1 + \cdots + \beta_m f_m$ (where
$\beta_1, \cdots, \beta_m$ are non-negative rational numbers with denominators $N$ such that
$\beta_1 + \cdots + \beta_m = 1$) are finite in number, can be written as a finite sequence
$g_1, \cdots, g_w$, and have the property described above for $\mathrm{Co}(\mathfrak{M})$ and $2\epsilon$.
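The rational approximation used at the end of this proof can be sketched in code. The example below is an illustration under stated assumptions, not the paper's construction: it rounds convex weights to multiples of $1/N$ by a largest-remainder rule (the helper `round_weights` and the sample data are hypothetical), and checks that the resulting combination differs from the original by at most $Cm/N$.

```python
from fractions import Fraction
from math import floor

def round_weights(alphas, N):
    """Round convex weights (exact Fractions summing to 1) to multiples of 1/N.

    The result is non-negative, sums to 1, and each entry moves by less than 1/N.
    """
    scaled = [a * N for a in alphas]
    base = [floor(s) for s in scaled]
    leftover = N - sum(base)   # an integer, because the weights sum to exactly 1
    # Give the missing units to the entries with the largest fractional parts.
    order = sorted(range(len(alphas)), key=lambda i: scaled[i] - base[i], reverse=True)
    for i in order[:leftover]:
        base[i] += 1
    return [Fraction(b, N) for b in base]

def combine(weights, funcs):
    """Pointwise convex combination of functions given as lists of values."""
    return [sum(w * f[x] for w, f in zip(weights, funcs)) for x in range(len(funcs[0]))]

funcs = [[Fraction(v) for v in row] for row in ([1, -2, 0, 3], [0, 1, -1, 2], [2, 2, -3, 1])]
alphas = [Fraction(1, 3), Fraction(2, 7), Fraction(8, 21)]
C = max(abs(v) for f in funcs for v in f)   # common bound for |f_mu(x)|
m, N = len(funcs), 100

betas = round_weights(alphas, N)
error = max(abs(e - a) for e, a in zip(combine(alphas, funcs), combine(betas, funcs)))

assert sum(betas) == 1 and all(b >= 0 for b in betas)
assert error <= Fraction(C * m, N)   # hence error < epsilon once N > C*m/epsilon
```

Exact `Fraction` arithmetic is used so that the weights sum to exactly 1; with floats the leftover count would need extra care.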

DEFINITION 3. If $f(x)$ is a real bounded function in $\mathfrak{G}$, we call

$$\operatorname{l.u.b.}_{x,y} |f(x) - f(y)| \quad (x \text{ and } y \text{ vary independently over } \mathfrak{G})$$

the oscillation of $f(x)$ and denote it by $\mathrm{Osc}_x f(x)$. If $f(x)$ is not a constant,
$\mathrm{Osc}_x f(x) > 0$; otherwise, $\mathrm{Osc}_x f(x)$ is zero.

THEOREM 4. For every real $g \in \mathrm{Co}\,R_f$ we have $\mathrm{Osc}_x g(x) \le \mathrm{Osc}_x f(x)$. If the
relation $\mathrm{Osc}_x g(x) < \mathrm{Osc}_x f(x)$ never occurs, and if $f(x)$ is l.a.p., $f(x)$ is necessarily
a constant.


The first statement is obvious. Suppose that the assumptions of the
second statement are valid. Let $a_1', \cdots, a_n'$ be any elements of $\mathfrak{G}$; then

$$\frac{f(xa_1') + \cdots + f(xa_n')}{n} \in \mathrm{Co}\,R_f,$$

and thus

$$\mathrm{Osc}_x \frac{f(xa_1') + \cdots + f(xa_n')}{n} = \mathrm{Osc}_x f(x).$$

This implies that

$$\operatorname{l.u.b.}_{x} \frac{f(xa_1') + \cdots + f(xa_n')}{n} = \operatorname{l.u.b.}_{x} f(x).$$

Put $\operatorname{l.u.b.}_{x} f(x) = C$; then for every $\epsilon > 0$ there exists an $x'$ such that

$$\frac{f(x'a_1') + \cdots + f(x'a_n')}{n} > C - \epsilon.$$

As all $f(x'a_\nu') \le C$, they all must also be $> C - n\epsilon$.

Now choose a $\delta > 0$ and find a finite number of elements of $L_f$ such that
each element of $L_f$ has a distance $< \delta$ from one of them (cf. the proof of
Theorem 3). That is, find a finite number of elements $a_1, \cdots, a_n$ of $\mathfrak{G}$ such that
for every $a$ of $\mathfrak{G}$ there exists a $\mu = 1, 2, \cdots, n$ for which $|f(a_\mu x) - f(ax)| < \delta$
identically. Now choose a $b$ of $\mathfrak{G}$ and repeat the argument just described in
the case where $\epsilon = \delta/n$, $a_1' = a_1^{-1}b, \cdots, a_n' = a_n^{-1}b$. Thus an $x'$ exists for which
all $f(x'a_\nu^{-1}b) > C - \delta$, $\nu = 1, 2, \cdots, n$, and therefore, for a properly chosen
$\mu = 1, 2, \cdots, n$, all $f(a_\mu a_\nu^{-1}b) > C - 2\delta$. If $\nu = \mu$, then $f(b) > C - 2\delta$.

On the other hand, $f(b) \le C$ and, as $\delta$ was arbitrary, it follows that $f(b) = C$.
Finally, $f(x)$ is constant since $b$ was arbitrary. This completes the proof of the
theorem.
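The first statement of Theorem 4 is easy to observe on a small example. The sketch below assumes the finite cyclic group $\mathbb{Z}_6$ written additively (so that $xa$ becomes $(x + a) \bmod 6$) and an arbitrary real function on it; the names `osc` and `right_translate_mix` are illustrative only.

```python
from fractions import Fraction

n = 6  # the cyclic group Z_6, written additively: "xa" is (x + a) mod n
f = [Fraction(v) for v in (3, -1, 4, 1, -5, 9)]   # an arbitrary real function on Z_6

def osc(g):
    """Oscillation of a real function on a finite group: l.u.b. of |g(x) - g(y)|."""
    return max(g) - min(g)

def right_translate_mix(f, weights, shifts):
    """Element of Co R_f: x -> sum of weights[k] * f(x + shifts[k])."""
    return [sum(w * f[(x + a) % n] for w, a in zip(weights, shifts)) for x in range(n)]

# A convex combination of three right translates never oscillates more than f itself.
g = right_translate_mix(f, [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)], [0, 2, 5])
assert osc(g) <= osc(f)

# Averaging over all translates with equal weights gives a constant function.
uniform = right_translate_mix(f, [Fraction(1, n)] * n, list(range(n)))
assert osc(uniform) == 0
```

On a finite group the strict decrease demanded by the second statement is therefore always attainable unless $f$ is already constant.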


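THEOREM 5. If $f(x)$ is a.p., there exists a constant $A$ toward which a certain
sequence extracted from $\mathrm{Co}\,R_f$ converges uniformly.


Since $f(x)$ is r.a.p., $R_f$ and $\mathrm{Co}\,R_f$ are c.c. Denote real and imaginary parts
by $\Re$ and $\Im$ respectively, consider the non-negative numbers $\mathrm{Osc}_x \Re g(x)
+ \mathrm{Osc}_x \Im g(x)$, $g \in \mathrm{Co}\,R_f$, and call their greatest lower bound $\omega$. We can extract
a sequence $g_1(x), g_2(x), \cdots$ from $\mathrm{Co}\,R_f$ such that $\mathrm{Osc}_x \Re g_n(x) + \mathrm{Osc}_x \Im g_n(x)
\to \omega$ as $n \to \infty$, and from this a subsequence $g_{n_1}(x), g_{n_2}(x), \cdots$, which
converges uniformly to a function $g(x)$. Hence $\mathrm{Osc}_x \Re g(x) + \mathrm{Osc}_x \Im g(x) = \omega$.
It is obvious that, $f(x)$ being l.a.p., every element $f(xa)$ of $R_f$ is l.a.p.
Therefore every element of $\mathrm{Co}\,R_f$ is l.a.p., and the uniform limit $g(x)$ as well as the
real functions $\Re g(x)$ and $\Im g(x)$ are l.a.p. If we show that $\mathrm{Osc}_x \Re g(x)
= \mathrm{Osc}_x \Im g(x) = 0$, we have $\Re g(x) =$ constant, $\Im g(x) =$ constant, that is,
$g(x) =$ constant, which proves our statement.

Suppose that $\mathrm{Osc}_x \Re g(x) > 0$. Then Theorem 4 shows that an $h \in \mathrm{Co}\,R_{\Re g}$
exists such that $\mathrm{Osc}_x h(x) < \mathrm{Osc}_x \Re g(x)$. Here $h(x) = \alpha_1 \Re g(xa_1) + \cdots
+ \alpha_n \Re g(xa_n)$ ($\alpha_1, \cdots, \alpha_n$ each $\ge 0$, $\alpha_1 + \cdots + \alpha_n = 1$). Putting $k(x)
= \alpha_1 g(xa_1) + \cdots + \alpha_n g(xa_n)$, we have $h(x) = \Re k(x)$, so that $\mathrm{Osc}_x \Re k(x)
< \mathrm{Osc}_x \Re g(x)$. But it is obvious that $\mathrm{Osc}_x \Im k(x) \le \mathrm{Osc}_x \Im g(x)$. Therefore
$\mathrm{Osc}_x \Re k(x) + \mathrm{Osc}_x \Im k(x) < \omega$. Now $g(x)$ can be uniformly approximated by
functions $p \in \mathrm{Co}\,R_f$, that is, $p(x) = \beta_1 f(xb_1) + \cdots + \beta_m f(xb_m)$ ($\beta_1, \cdots, \beta_m$ each
$\ge 0$, $\beta_1 + \cdots + \beta_m = 1$). Hence $k(x)$ can be uniformly approximated by
functions $q(x) = \alpha_1\beta_1 f(xa_1b_1) + \alpha_1\beta_2 f(xa_1b_2) + \cdots + \alpha_n\beta_m f(xa_nb_m)$, that is, by
functions $q \in \mathrm{Co}\,R_f$. Since $\mathrm{Osc}_x \Re k(x) + \mathrm{Osc}_x \Im k(x) < \omega$, the relation
$\mathrm{Osc}_x \Re q(x) + \mathrm{Osc}_x \Im q(x) < \omega$ results for a sufficiently close $q$. This contradicts the
definition of $\omega$. Similarly $\mathrm{Osc}_x \Im g(x) > 0$ is disproved.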


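Remark. If a finite number of a.p. functions $f_1(x), \cdots, f_t(x)$ are given, it
is possible to find a set of constants $A_1, \cdots, A_t$, toward which $t$ sequences
extracted from $\mathrm{Co}\,R_{f_1}, \cdots, \mathrm{Co}\,R_{f_t}$ respectively, with the same $\alpha_1, \cdots, \alpha_n$,
$a_1, \cdots, a_n$, converge uniformly (that is, sequences of the form

$$\alpha_1^{(p)} f_1(xa_1^{(p)}) + \cdots + \alpha_n^{(p)} f_1(xa_n^{(p)}), \; \cdots, \; \alpha_1^{(p)} f_t(xa_1^{(p)}) + \cdots + \alpha_n^{(p)} f_t(xa_n^{(p)}),$$

where $\alpha_1^{(p)} \ge 0, \cdots, \alpha_n^{(p)} \ge 0$, and $\alpha_1^{(p)} + \cdots + \alpha_n^{(p)} = 1$).


The argument which proved Theorem 5 may be repeated here if we use
$\mathrm{Osc}_x \Re f_1(x) + \mathrm{Osc}_x \Im f_1(x) + \cdots + \mathrm{Osc}_x \Re f_t(x) + \mathrm{Osc}_x \Im f_t(x)$ instead of
$\mathrm{Osc}_x \Re f(x) + \mathrm{Osc}_x \Im f(x)$.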


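DEFINITION 4. A real number $A$ which may be uniformly approximated by
functions from $\mathrm{Co}\,R_f$ or $\mathrm{Co}\,L_f$, that is, a number $A$ such that, for every $\epsilon > 0$,
there exists a number $n = 1, 2, \cdots$, numbers $\alpha_1, \cdots, \alpha_n$ each $\ge 0$ with
$\alpha_1 + \cdots + \alpha_n = 1$, and elements $a_1, \cdots, a_n$ of $\mathfrak{G}$ such that the condition
$|\alpha_1 f(xa_1) + \cdots + \alpha_n f(xa_n) - A| \le \epsilon$ or $|\alpha_1 f(a_1x) + \cdots + \alpha_n f(a_nx) - A| \le \epsilon$
holds throughout $\mathfrak{G}$, is called a right-mean or a left-mean of $f(x)$ respectively.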


THEOREM 6. If f(x) is a.p., it has exactly one right-mean, exactly one left- 
mean, and these means are equal. 


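The existence of a right-mean has been proved by Theorem 5. If we change
the multiplication law $ab$ in $\mathfrak{G}$ to $ba$, all notions remain unchanged except for
the interchange of “right” and “left.” Thus a left-mean must exist.

Now let $A$ be a right-mean, let $B$ be a left-mean, and let $\epsilon$ be $> 0$. Choose
$\alpha_1, \cdots, \alpha_n$, $\beta_1, \cdots, \beta_m$ (each $\ge 0$, $\alpha_1 + \cdots + \alpha_n = 1$, $\beta_1 + \cdots + \beta_m = 1$) and
$a_1, \cdots, a_n$, $b_1, \cdots, b_m$ such that

$$|\alpha_1 f(xa_1) + \cdots + \alpha_n f(xa_n) - A| \le \epsilon,$$
$$|\beta_1 f(b_1x) + \cdots + \beta_m f(b_mx) - B| \le \epsilon.$$

If we replace $x$ in the first inequality by $b_1x, \cdots, b_mx$ in succession, multiply by
$\beta_1, \cdots, \beta_m$ respectively, and add, we obtain

$$|\alpha_1\beta_1 f(b_1xa_1) + \alpha_1\beta_2 f(b_2xa_1) + \cdots + \alpha_n\beta_m f(b_mxa_n) - A| \le \epsilon.$$

Similarly, if we replace $x$ in the second inequality by $xa_1, \cdots, xa_n$ in succession,
multiply by $\alpha_1, \cdots, \alpha_n$ respectively, and add, we obtain

$$|\alpha_1\beta_1 f(b_1xa_1) + \alpha_1\beta_2 f(b_2xa_1) + \cdots + \alpha_n\beta_m f(b_mxa_n) - B| \le \epsilon.$$

Therefore $|A - B| \le 2\epsilon$ and, as $\epsilon$ may be arbitrarily small, $A = B$.

DEFINITION 5. If $f(x)$ is a.p., we call the common value of its uniquely
determined right- and left-means the mean of $f(x)$, and denote it by $M_x f(x)$.*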
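For a finite group the mean of Definition 5 can be computed directly, which gives a concrete picture of Definition 4. The sketch below assumes the symmetric group on three letters, equal weights $\alpha_\nu = 1/6$, and all six group elements as $a_\nu$; the function values are arbitrary and the helper names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

G = list(permutations(range(3)))   # the six elements of the symmetric group S_3

def mul(a, b):
    """Group product as composition of permutations: (a*b)(i) = a[b[i]]."""
    return tuple(a[b[i]] for i in range(3))

# An arbitrary function on the group, with exact rational values.
f = {g: Fraction(k * k - 3 * k) for k, g in enumerate(G)}

w = Fraction(1, len(G))   # equal weights alpha_nu = 1/|G|
right_avg = {x: sum(w * f[mul(x, a)] for a in G) for x in G}   # element of Co R_f
left_avg = {x: sum(w * f[mul(a, x)] for a in G) for x in G}    # element of Co L_f
mean = sum(f.values()) / len(G)

# Both averages are constant in x and equal to each other, so the right-mean
# and the left-mean coincide, although S_3 is not Abelian.
assert set(right_avg.values()) == {mean}
assert set(left_avg.values()) == {mean}
```

In this finite case the approximation in Definition 4 is exact ($\epsilon = 0$); for infinite groups only approximate constancy can be expected.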
We now state the most important properties of the mean. 


* Definitions 3-5 and the argument of Theorems 3-6 are in very close analogy to the author’s 
construction of the Haar-Lebesgue measure in compact groups [19]. It is noteworthy that for non- 
compact groups, where Haar proved by his method the existence of an integral [11], our method 
leads to an integral-mean. 




THEOREM 7. If $f(x)$ and $g(x)$ are a.p. functions, all the functions $f(xa)$,
$f(ax)$, $f(x^{-1})$, $\overline{f(x)}$, $\alpha f(x)$, $f(x)+g(x)$ ($\alpha$ a complex number, $a$ an element of $\mathfrak{G}$)
are a.p. (cf. Theorem 1). Furthermore, we have the following:


(1) $M_x[\alpha f(x)] = \alpha M_x f(x)$.

(2) $M_x[f(x) + g(x)] = M_x f(x) + M_x g(x)$.

(3) $M_x 1 = 1$.

(4) If $f(x)$ is real and $\ge 0$ throughout $\mathfrak{G}$, then $M_x f(x) \ge 0$; and if, in addition,
$f(x) \not\equiv 0$, then $M_x f(x) > 0$.

(5) | M.[f(x)]| 
(6) $M_x[\overline{f(x)}] = \overline{M_x[f(x)]}$.

(7) $M_x f(xa) = M_x f(x)$.

(8) $M_x f(ax) = M_x f(x)$.

(9) $M_x f(x^{-1}) = M_x f(x)$.

The equations (1), (3), (5), (6) and the first half of (4) are obvious; as
every left-mean of $f(x)$ is a left-mean of $f(xa)$ and as every right-mean of $f(x)$
is a right-mean of $f(ax)$, (7) and (8) are valid; as every right-mean of $f(x)$
is a left-mean of $f(x^{-1})$, (9) is true. Thus, only (2) and the second half of (4)
remain unproved.

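In order to prove (2), put $M_x f(x) = A$, $M_x g(x) = B$, let $\epsilon$ be $> 0$, and choose
$\alpha_1, \cdots, \alpha_n$ ($\alpha_1, \cdots, \alpha_n$ each $\ge 0$, $\alpha_1 + \cdots + \alpha_n = 1$) and $a_1, \cdots, a_n$ such
that

$$|\alpha_1 f(xa_1) + \cdots + \alpha_n f(xa_n) - A| \le \epsilon.$$

Now $\alpha_1 g(xa_1) + \cdots + \alpha_n g(xa_n)$ obviously has the same left-mean as $g(x)$, i.e.,
$B$. Therefore we can choose $\beta_1, \cdots, \beta_m$ ($\beta_1, \cdots, \beta_m$ each $\ge 0$,
$\beta_1 + \cdots + \beta_m = 1$) and $b_1, \cdots, b_m$ such that

$$|\alpha_1\beta_1 g(xb_1a_1) + \alpha_1\beta_2 g(xb_2a_1) + \cdots + \alpha_n\beta_m g(xb_ma_n) - B| \le \epsilon.$$

If we replace $x$ in the first inequality by $xb_1, \cdots, xb_m$ in succession, multiply by
$\beta_1, \cdots, \beta_m$ respectively, and add, we obtain

$$|\alpha_1\beta_1 f(xb_1a_1) + \alpha_1\beta_2 f(xb_2a_1) + \cdots + \alpha_n\beta_m f(xb_ma_n) - A| \le \epsilon.$$

Denote $nm$ by $p$; $\alpha_1\beta_1, \cdots, \alpha_n\beta_m$ by $\gamma_1, \cdots, \gamma_p$ ($\gamma_1, \cdots, \gamma_p$ each $\ge 0$,
$\gamma_1 + \cdots + \gamma_p = 1$); $b_1a_1, \cdots, b_ma_n$ by $c_1, \cdots, c_p$; we get, by adding
our inequalities,

$$|\gamma_1(f(xc_1) + g(xc_1)) + \cdots + \gamma_p(f(xc_p) + g(xc_p)) - (A + B)| \le 2\epsilon.$$

As $\epsilon$ may be arbitrarily small, this shows that $M_x[f(x) + g(x)] = A + B$. (Another
way to prove (2) would be to apply the Remark following Theorem 5
to $f(x)$ and $g(x)$.)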

In order to prove the second half of (4), assume $f(x) \ge 0$ everywhere and
$f(x_0) > 0$ for one particular $x_0$. For any $\epsilon > 0$ a finite number of elements of
$R_f$ exist such that each element of $R_f$ has a distance $\le \epsilon$ from one of them
(cf. the proof of Theorem 3). Hence there is a finite number of the elements
$a_1, \cdots, a_n$ such that, for every $a$, there exists a $\nu = 1, 2, \cdots, n$ such that
$|f(xa_\nu) - f(xa)| \le \epsilon$ identically. Now take $\epsilon = f(x_0)/2$. The substitution $x = x_0a_\nu^{-1}$
shows that $f(x_0a_\nu^{-1}a) \ge f(x_0)/2$. Hence, for each $a$ it follows that $f(x_0a_\nu^{-1}a) \ge 0$
for every $\nu = 1, \cdots, n$, but that $f(x_0a_\nu^{-1}a) \ge f(x_0)/2$ for at least one $\nu$. Thus
$f(x_0a_1^{-1}a) + \cdots + f(x_0a_n^{-1}a) \ge f(x_0)/2$, that is, the function $g(y) = f(x_0a_1^{-1}y)
+ \cdots + f(x_0a_n^{-1}y) - f(x_0)/2$ is always $\ge 0$. Hence the first half of (4) leads
to the result that $M_y g(y) \ge 0$, (2), (3), (6), (7), and (8) show that $M_y g(y)
= nM_y f(y) - f(x_0)/2$, and it follows that

$$M_y f(y) \ge \frac{f(x_0)}{2n} > 0.$$


THEOREM 8. The formal properties (1)-(9) determine $M_x f(x)$ uniquely; in
fact, (1)-(3), the first half of (4), and (7) or (8) are sufficient.

It is sufficient to consider (1)-(3), the first half of (4), and (7), as (8) may
be obtained by replacing $ab$ in $\mathfrak{G}$ by $ba$. So assume that a functional $M_x' f(x)$,
defined for all a.p. $f(x)$ and satisfying (1)-(3), the first half of (4), and (7),
is given.

For every $\epsilon > 0$ we can choose $\alpha_1, \cdots, \alpha_n$, $a_1, \cdots, a_n$ ($\alpha_1, \cdots, \alpha_n$ each
$\ge 0$, $\alpha_1 + \cdots + \alpha_n = 1$) such that $|\alpha_1 f(xa_1) + \cdots + \alpha_n f(xa_n) - M_x f(x)| \le \epsilon$,
or if $f(x)$ is real,

$$M_x f(x) - \epsilon \le \alpha_1 f(xa_1) + \cdots + \alpha_n f(xa_n) \le M_x f(x) + \epsilon.$$

Then (1)-(3), the first half of (4), and (7) show that $M_x f(x) - \epsilon \le M_x' f(x)
\le M_x f(x) + \epsilon$, and as $\epsilon$ was arbitrary, $M_x' f(x) = M_x f(x)$. Property (1) with
$\alpha = i$ shows that this holds also for pure imaginary $f(x)$, and property (2)
shows that it holds for every $f(x)$.

Theorems 6-8 show that, for a.p. functions $f(x)$, there is exactly one way
to define a notion $M_x f(x)$ possessing the essential formal properties of a mean.
Our $M_x f(x)$ is the equivalent of the well known integral mean

$$\lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} f(x)\,dx$$

in Bohr’s theory, when $\mathfrak{G}$ is the additive group of all real numbers. But even
in this case the form of our definition is essentially different from the usual
one (for example, it does not use continuity), and it gives a new approach to
the problem.
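For the classical case the integral mean can be evaluated in closed form on the elementary functions. The sketch below assumes they are written $e^{i\lambda x}$ with real $\lambda$ (an assumption about notation), for which $\frac{1}{2T}\int_{-T}^{T} e^{i\lambda x}\,dx = \sin(\lambda T)/(\lambda T)$ when $\lambda \ne 0$; the function name `bohr_average` is hypothetical.

```python
import math

def bohr_average(lam: float, T: float) -> float:
    """(1/2T) * integral over [-T, T] of exp(i*lam*x) dx, in closed form.

    The imaginary part cancels by symmetry, so the value is real.
    """
    if lam == 0.0:
        return 1.0
    return math.sin(lam * T) / (lam * T)

# As T grows the average tends to 1 for lam = 0 and to 0 for every lam != 0,
# so the mean picks out the constant term of a trigonometric polynomial.
for T in (1e1, 1e3, 1e5):
    print(T, bohr_average(0.0, T), bohr_average(math.sqrt(2.0), T))

assert abs(bohr_average(math.sqrt(2.0), 1e5)) <= 1e-5
```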

REMARK.* The notion of the mean can be modified in the following manner.
Consider the doubled group $\mathfrak{G}\mathfrak{G}'$, that is, the set of all pairs $[a, a']$. This set
is a group by virtue of the definitions $[a, a'][b, b'] = [ab, b'a']$ and
$[a, a']^{-1} = [a^{-1}, a'^{-1}]$. (This is similar to the construction in the next
paragraph except that here we use $b'a'$ while there we use $a'b'$.) The argument in
the proof of Theorem 9 below shows that if $f(x)$ is a.p. in $\mathfrak{G}$, then
$f_0([x, x']) = f(xx')$ is a.p. in $\mathfrak{G}\mathfrak{G}'$. By Theorem 5 it then follows that there
exists a constant $A$ such that, for every $\epsilon > 0$, there exists a number
$n = 1, 2, \cdots$, numbers $\alpha_1, \cdots, \alpha_n$ each $\ge 0$ with $\alpha_1 + \cdots + \alpha_n = 1$, and
elements $a_1, \cdots, a_n$ and $b_1, \cdots, b_n$ of $\mathfrak{G}$ such that the condition

$$|\alpha_1 f_0([x, y][a_1, b_1]) + \cdots + \alpha_n f_0([x, y][a_n, b_n]) - A| \le \epsilon$$

holds for all $x$ and $y$ in $\mathfrak{G}$. If we write $a_1b_1 = c_1, \cdots, a_nb_n = c_n$ this condition
assumes the form

$$|\alpha_1 f(xc_1y) + \cdots + \alpha_n f(xc_ny) - A| \le \epsilon.$$

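This mean is even easier to handle than our right- and left-means (which
are special cases of it). This is due to the following fact: choose two arbitrary
sets of numbers $\beta_1, \cdots, \beta_k$ and $\gamma_1, \cdots, \gamma_l$, each $\ge 0$, with $\beta_1 + \cdots + \beta_k = 1$
and $\gamma_1 + \cdots + \gamma_l = 1$, and two arbitrary sets of elements $a_1, \cdots, a_k$ and
$b_1, \cdots, b_l$ of $\mathfrak{G}$. In our last inequality replace $x$ and $y$ by $xa_\kappa$ and $b_\lambda y$, multiply
by $\beta_\kappa\gamma_\lambda$, and add over all $\kappa = 1, \cdots, k$; $\lambda = 1, \cdots, l$. Then we obtain an
inequality of the same type except that there are $knl$ terms instead of $n$
terms and $\beta_\kappa\alpha_\nu\gamma_\lambda$ and $a_\kappa c_\nu b_\lambda$ appear in place of $\alpha_\nu$ and $c_\nu$. This shows that
the conditions

$$|\alpha_1 f(xc_1y) + \cdots + \alpha_m f(xc_my) - A| \le \epsilon,$$
$$|\alpha_1' g(xc_1'y) + \cdots + \alpha_n' g(xc_n'y) - B| \le \epsilon$$

imply the conditions

$$|\alpha_1'' f(xc_1''y) + \cdots + \alpha_{mn}'' f(xc_{mn}''y) - A| \le \epsilon,$$
$$|\alpha_1'' g(xc_1''y) + \cdots + \alpha_{mn}'' g(xc_{mn}''y) - B| \le \epsilon$$

if $\alpha_j''$ and $c_j''$ are $\alpha_\mu\alpha_\nu'$ and $c_\mu c_\nu'$ (in some order). This gives the uniqueness,
the extension to complex $f(x)$, and the additivity of our new mean at once. Of
course this mean coincides with our former means.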


* Added February 4, 1934. 




7. The applications to be made in the next chapter necessitate our prov- 
ing some facts concerning double means. We therefore pass to this subject. 

The group $\mathfrak{G}$ can be “doubled,” that is, we can consider the set $\mathfrak{G}\mathfrak{G}$ of all
pairs $[a, a']$, which by the definitions $[a, a'][b, b'] = [ab, a'b']$, $[a, a']^{-1}
= [a^{-1}, a'^{-1}]$ becomes a group, and we will denote functions in it by $f(x, x')$
instead of by $f([x, x'])$. All our notions apply to $\mathfrak{G}\mathfrak{G}$: we have a.p. functions
$f(x, x')$ in $\mathfrak{G}\mathfrak{G}$, and a mean $M_{x,x'} f(x, x')$.


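THEOREM 9. If $f(x)$ is a.p. in $\mathfrak{G}$, the eight functions $f(xx')$, $f(x'x)$,
$f(x^{-1}x')$, $f(x'x^{-1})$, $f(xx'^{-1})$, $f(x'^{-1}x)$, $f(x^{-1}x'^{-1})$, $f(x'^{-1}x^{-1})$ are all a.p. in $\mathfrak{G}\mathfrak{G}$.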


Interchange of $x$ and $x'$, of $ab$ in $\mathfrak{G}$ with $ba$, and of $f(x)$ with $f(x^{-1})$ reduces
our task to discussing $f(xx')$ and $f(xx'^{-1})$ alone. Their a.p. character in $\mathfrak{G}\mathfrak{G}$
means that the sets of functions $f(axa'x')$, $f(xax'a')$, $f(axx'^{-1}a'^{-1})$,
$f(xaa'^{-1}x'^{-1})$ in $\mathfrak{G}\mathfrak{G}$ are c.c. or else that the sets of functions of one or of two
variables, $f(axby)$, $f(xayb)$, $f(axb)$, $f(xay)$ in $\mathfrak{G}$ are c.c. The third case arises
from the first by setting $y = 1$, the fourth from the second by setting $b = 1$, and
the second from the first by interchanging $a$ and $x$ with $b$ and $y$, and $ab$ in $\mathfrak{G}$
with $ba$. So we need to discuss only $f(axby)$.

Choose an $\epsilon > 0$. As $f(x)$ is r.a.p., there is a finite number of elements
$y_1, \cdots, y_m$ such that for each $y$ there is a $\nu = 1, \cdots, m$ for which
$|f(zy) - f(zy_\nu)| \leq \epsilon$ identically. As each $f(xy_\nu)$ is r.a.p., the set of all functions
$f(xby_\nu)$ is c.c. for every $\nu = 1, \cdots, m$, and therefore even the set of all “vector-functions”
with components $[f(xby_1), \cdots, f(xby_m)]$ is c.c. Therefore a
finite number of elements $b_1, \cdots, b_n$ exist such that for each $b$ there is a
$\mu = 1, 2, \cdots, n$ for which $|f(zby_\nu) - f(zb_\mu y_\nu)| \leq \epsilon$ for all $z$ and for every
$\nu = 1, 2, \cdots, m$. Our two inequalities together give the result that

$$|f(zby) - f(zb_\mu y)| \leq 3\epsilon.$$

Finally, $f(x)$ is l.a.p., so that there exists a finite set of elements $a_1, \cdots, a_l$
such that for every $a$ there is a $\lambda = 1, \cdots, l$ for which $|f(au) - f(a_\lambda u)| \leq \epsilon$
identically. This, together with our last inequality, implies that $|f(axby)
- f(a_\lambda x b_\mu y)| \leq 4\epsilon$ identically.

As this holds for every $\epsilon > 0$, the c.c. character of the set of functions
$f(axby)$ is proved [13, pp. 108-109].

THEOREM 10. If $f(x, x')$ is a.p. in $\mathfrak{G}\mathfrak{G}$, it is also a.p. in $\mathfrak{G}$ as a function of $x$
or as a function of $x'$. Thus we can form $M_x f(x, x')$ and $M_{x'} f(x, x')$ which are
a.p. in $\mathfrak{G}$ as a function of $x'$ and as a function of $x$, respectively. Thus we can
form $M_{x'}[M_x f(x, x')]$ and $M_x[M_{x'} f(x, x')]$. These expressions are both equal
to $M_{x,x'} f(x, x')$.

The first statement is obvious. In the second and third statements it is
sufficient to consider $M_x f(x, x')$ and $M_{x'}[M_x f(x, x')]$, as interchange of $x$
with $x'$ and of $f(x, x')$ with $f(x', x)$ leads to the rest of the theorem.

Consider a sequence $a_1, a_2, \cdots$ of elements of $\mathfrak{G}$. As $f(x, x')$ is r.a.p., the
sequence $f(x, x'a_1), f(x, x'a_2), \cdots$ contains a uniformly convergent subsequence
$f(x, x'a_{n_1}), f(x, x'a_{n_2}), \cdots$, such that, for every $\epsilon > 0$ and almost all
$\mu$ and $\nu$, $|f(x, x'a_{n_\mu}) - f(x, x'a_{n_\nu})| \leq \epsilon$. This implies that $|M_x f(x, x'a_{n_\mu})
- M_x f(x, x'a_{n_\nu})| \leq \epsilon$. Thus the set of functions of $x'$, $M_x f(x, x'a)$, is c.c. Therefore
$M_x f(x, x')$ is r.a.p. and interchange of $ab$ in $\mathfrak{G}$ with $ba$ shows that it is
l.a.p. Hence it is a.p.

Now it is obvious that $M'_{x,x'} f(x, x') = M_{x'}[M_x f(x, x')]$ has Properties
(1)-(4) and (7) enumerated in Theorem 7 if we look at it as an $[x, x']$-mean.
Therefore we may conclude from Theorem 8 that it is $M_{x,x'} f(x, x')$.

Theorems 9 and 10 may be extended by iterating them $m$ times to functions
of $2^m$ variables; by choosing $2^m \geq n$ and taking the functions constant in
the last $2^m - n$ variables these theorems may be extended to functions of $n$
variables.


II. APPLICATION OF THE METHOD OF WEYL AND E. SCHMIDT 
PROOF OF THE FUNDAMENTAL THEOREMS 


8. The results of Part I enable us to apply the method of Weyl to the
proof of the fundamental theorems of Bohr's theory of a.p. functions (in the
addition group of real numbers) and to the discussion of the linear-orthogonal
representations of continuous groups.* The present part, II, contains a
proof of “Parseval's formula” (equivalent to Theorem 15), which runs exactly
along the lines of Weyl's proof. It also contains the proof of the “approximation
theorem” (equivalent to Theorem 18) where a different device,
due to N. Wiener, has to be used because of the difficulties of constructing in
our general case an a.p. function with the required properties (cf. [31, pp.
348-349], and our Theorem 17). The next part, III, contains an interpretation
and application of these theorems connecting the theories of a.p. functions
and of representations. In this, Weyl's method is of fundamental
importance.


DEFINITION 6. If $f(x)$ and $g(x)$ are a.p., we set

$$h(x) = M_y[f(xy^{-1})g(y)] = M_y[f(y)g(y^{-1}x)] = f \times g(x).$$

We observe that the two expressions for $h(x)$ are equal by Theorem 7 and
Properties (7) and (9), after making the substitution $y^{-1}x$ for $y$, and that
$h(x)$ is a.p. by Theorems 9 and 10.
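
As an illustration of Definition 6, the following sketch works in the simplest special case, a finite group, where every function is a.p. and the mean is the plain average over the group. The choice of the symmetric group on three letters, the helper names (`mul`, `inv`, `mean`, `conv`, `conv_second_form`) and the test functions are assumptions made for this example only; they are not part of the text.

```python
from itertools import permutations

# Assumed example group: S3, with elements stored as tuples (permutations of 0, 1, 2).
G = list(permutations(range(3)))

def mul(a, b):
    """Product ab, meaning 'apply b first, then a'."""
    return tuple(a[b[i]] for i in range(3))

def inv(a):
    """Inverse permutation."""
    r = [0] * 3
    for i, ai in enumerate(a):
        r[ai] = i
    return tuple(r)

def mean(values):
    """On a finite group the mean is the average over all elements."""
    values = list(values)
    return sum(values) / len(values)

def conv(f, g):
    """First form: (f x g)(x) = M_y[ f(x y^-1) g(y) ]."""
    return {x: mean(f[mul(x, inv(y))] * g[y] for y in G) for x in G}

def conv_second_form(f, g):
    """Second form: (f x g)(x) = M_y[ f(y) g(y^-1 x) ]."""
    return {x: mean(f[y] * g[mul(inv(y), x)] for y in G) for x in G}

# Two arbitrary complex-valued functions on G (hypothetical data).
f = {x: complex(i + 1, -i) for i, x in enumerate(G)}
g = {x: complex(2 - i, i * i) for i, x in enumerate(G)}

h1 = conv(f, g)
h2 = conv_second_form(f, g)
max_gap = max(abs(h1[x] - h2[x]) for x in G)   # vanishes up to rounding
```

The two forms agree because $y \to y^{-1}x$ is a one-to-one map of the group onto itself, which is the finite counterpart of the substitution used above.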


* Cf. H. Weyl [31], H. Weyl and F. Peter [32]. The operational methods used there are partly 
based on the thesis of E. Schmidt [20]. 




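Remark. $f \times g$ can be uniformly approximated by functions of the form
$\gamma_1 f(xc_1) + \cdots + \gamma_n f(xc_n)$ ($\gamma_1, \cdots, \gamma_n$ are complex numbers), that is, for every
$\epsilon > 0$ there exist numbers $\gamma_1, \cdots, \gamma_n$ and elements $c_1, \cdots, c_n$ such that

$$| f \times g(x) - \gamma_1 f(xc_1) - \cdots - \gamma_n f(xc_n) | \leq \epsilon$$

holds throughout $\mathfrak{G}$.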


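$g(x)$ is a.p. and therefore bounded (Theorem 2); suppose $|g(x)| \leq C$. Now
choose a $\delta > 0$. According to our remark in the proof of Theorem 3, it is possible
to find a finite number of elements $b_1, \cdots, b_k$ of $\mathfrak{G}$ such that to every
$x$ there exists a $\kappa = 1, \cdots, k$ for which $|f(xz) - f(b_\kappa z)| \leq \delta$ holds identically.
Now consider the a.p. $y$-functions $f(b_\kappa y^{-1})g(y)$, $\kappa = 1, \cdots, k$, and apply to
them the Remark following Theorem 5 (with $\epsilon/2$): if $\epsilon > 0$, there exists a set
of real numbers $\alpha_1, \cdots, \alpha_n$ ($\alpha_1, \cdots, \alpha_n$ each $\geq 0$, $\alpha_1 + \cdots + \alpha_n = 1$) and a
set of elements $a_1, \cdots, a_n$ of $\mathfrak{G}$ such that

$$| \alpha_1 f(b_\kappa a_1^{-1} y^{-1}) g(ya_1) + \cdots + \alpha_n f(b_\kappa a_n^{-1} y^{-1}) g(ya_n) - M_y[f(b_\kappa y^{-1})g(y)] | \leq \frac{\epsilon}{2}$$

holds identically for every $\kappa = 1, \cdots, k$. For every $x$ there exists a $\kappa$ such that
$|f(xz) - f(b_\kappa z)| \leq \delta$ and therefore such that

$$| f(xu^{-1})g(u) - f(b_\kappa u^{-1})g(u) | \leq C\delta$$

and

$$| M_y[f(xy^{-1})g(y)] - M_y[f(b_\kappa y^{-1})g(y)] | \leq C\delta.$$

Hence we have the result that

$$| \alpha_1 f(x a_1^{-1} y^{-1}) g(ya_1) + \cdots + \alpha_n f(x a_n^{-1} y^{-1}) g(ya_n) - M_y[f(xy^{-1})g(y)] | \leq \frac{\epsilon}{2} + 2C\delta.$$

Our statement is proved if we put $\delta = \epsilon/(4C)$ and $y = 1$, and substitute
$\alpha_1 g(a_1), \cdots, \alpha_n g(a_n)$ for $\gamma_1, \cdots, \gamma_n$ and $a_1^{-1}, \cdots, a_n^{-1}$ for $c_1, \cdots, c_n$.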


THEOREM 11. The “multiplication” $f \times g$ is distributive (linear) in both factors,
associative, and if $\mathfrak{G}$ is Abelian, commutative.

The theorem is obvious except for associativity. Our second form for
$h = f \times g$ gives

$$(f \times g) \times k(x) = M_z[M_y[f(y)g(y^{-1}z)] \, k(z^{-1}x)],$$
$$f \times (g \times k)(x) = M_y[f(y) \, M_z[g(z)k(z^{-1}y^{-1}x)]],$$

and these expressions are equal by Theorems 9 and 10.
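
Theorem 11 can be checked numerically in the finite case, where the mean is the average over the group. The sketch below is only an illustration under that assumption; the group (the symmetric group on three letters), the helper names and the test functions are chosen for the example and do not come from the text.

```python
from itertools import permutations

G = list(permutations(range(3)))          # assumed example group: S3

def mul(a, b):
    """Product ab of two permutations (apply b first, then a)."""
    return tuple(a[b[i]] for i in range(3))

def inv(a):
    """Inverse permutation."""
    return tuple(a.index(i) for i in range(3))

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def conv(f, g):
    """(f x g)(x) = M_y[ f(x y^-1) g(y) ]."""
    return {x: mean(f[mul(x, inv(y))] * g[y] for y in G) for x in G}

def delta(a):
    """Function equal to 1 at a and 0 elsewhere."""
    return {x: (1.0 if x == a else 0.0) for x in G}

# Associativity for three arbitrary functions (hypothetical data).
f = {x: complex(i + 1, -i) for i, x in enumerate(G)}
g = {x: complex(2 - i, i * i) for i, x in enumerate(G)}
k = {x: complex(i % 3, 1) for i, x in enumerate(G)}
left = conv(conv(f, g), k)
right = conv(f, conv(g, k))
assoc_gap = max(abs(left[x] - right[x]) for x in G)

# Commutativity may fail when the group is not Abelian.
a, b = (1, 0, 2), (0, 2, 1)               # two transpositions with ab != ba
ab, ba = conv(delta(a), delta(b)), conv(delta(b), delta(a))
comm_gap = max(abs(ab[x] - ba[x]) for x in G)
```

Here `assoc_gap` vanishes up to rounding, while `comm_gap` equals 1/6: the product of the two delta functions is concentrated at $ab$ in one order and at $ba$ in the other.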


DEFINITION 7. If $f(x)$ is a.p., we denote $f \times f \times \cdots \times f$ ($n$ factors) by $f^n$,
and $\overline{f(x^{-1})}$ by $f'(x)$. Furthermore, we define

$$Nf = \{ M_x[\, |f(x)|^2 \,] \}^{1/2}.$$

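THEOREM 12. Let $f(x)$ and $g(x)$ be a.p. functions. The following formulas
hold:
(1) If $f \neq 0$, $Nf > 0$.
(2) $N[\theta f] = |\theta| \, Nf$, $N[f + g] \leq Nf + Ng$, $N[f \times g] \leq (Nf)(Ng)$.
(3) $f \times f'(1) = f' \times f(1) = (Nf)^2$.
(4) $| M_x f(x) | \leq Nf$.
(5) $| f \times g(x) | \leq (Nf)(Ng)$.
(6) $M_x[\, |f(x)| \, |g(x)| \,] \leq (Nf)(Ng)$.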

Statements (1), the first part of (2), and (3) are obvious. The second part
of (2), after being squared, means that

$$M_x[\, |f(x) + g(x)|^2 \,] \leq M_x[\, |f(x)|^2 \,] + M_x[\, |g(x)|^2 \,] + 2(Nf)(Ng),$$

that is,

$$\Re \, M_x[f(x)\overline{g(x)}] \leq (Nf)(Ng).$$

This obviously follows from (6). The third part of (2) again follows from (5)
by squaring and applying $M_x$. (4) follows from (5) by putting $g(x) = 1$, since
$f \times 1(x) = M_y[f(y)]$.

Hence we need to prove only (5) and (6). Since

$$| f(y) g(y^{-1}x) | \leq \tfrac{1}{2} |f(y)|^2 + \tfrac{1}{2} |g(y^{-1}x)|^2,$$

it follows that

$$| f \times g(x) | \leq \tfrac{1}{2} M_y[\, |f(y)|^2 \,] + \tfrac{1}{2} M_y[\, |g(y^{-1}x)|^2 \,] = \tfrac{1}{2}(Nf)^2 + \tfrac{1}{2}(Ng)^2.$$

Starting from

$$| f(x) | \, | g(x) | \leq \tfrac{1}{2} |f(x)|^2 + \tfrac{1}{2} |g(x)|^2$$

we obtain similarly

$$M_x[\, |f(x)| \, |g(x)| \,] \leq \tfrac{1}{2}(Nf)^2 + \tfrac{1}{2}(Ng)^2.$$

If we replace $f$ and $g$ by $\gamma f$ and $g/\gamma$ ($\gamma$ real and $> 0$) we see that $| f \times g(x) |$
and $M_x[\, |f(x)| \, |g(x)| \,]$ do not exceed

$$\frac{\gamma^2}{2}(Nf)^2 + \frac{1}{2\gamma^2}(Ng)^2.$$

The greatest lower bound of this expression is $(Nf)(Ng)$. This completes the
proof of (5) and (6).
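
A quick numerical sanity check of statements (3), (5) and the third part of (2) of Theorem 12 is possible on a finite group, where the mean is the average. This is a sketch under that assumption only; the group, the helper names (`conv`, `prime`, `norm`) and the sample functions are hypothetical choices for the example.

```python
from itertools import permutations
from math import sqrt

G = list(permutations(range(3)))          # assumed example group: S3
E = (0, 1, 2)                             # identity element

def mul(a, b):
    return tuple(a[b[i]] for i in range(3))

def inv(a):
    return tuple(a.index(i) for i in range(3))

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def conv(f, g):
    """(f x g)(x) = M_y[ f(x y^-1) g(y) ]."""
    return {x: mean(f[mul(x, inv(y))] * g[y] for y in G) for x in G}

def prime(f):
    """f'(x) = complex conjugate of f(x^-1)."""
    return {x: f[inv(x)].conjugate() for x in G}

def norm(f):
    """Nf = sqrt( M_x |f(x)|^2 )."""
    return sqrt(mean(abs(f[x]) ** 2 for x in G))

f = {x: complex(i + 1, -i) for i, x in enumerate(G)}
g = {x: complex(2 - i, i * i) for i, x in enumerate(G)}
fg = conv(f, g)

gap_3 = abs(conv(f, prime(f))[E] - norm(f) ** 2)                   # statement (3)
holds_5 = max(abs(fg[x]) for x in G) <= norm(f) * norm(g) + 1e-12  # statement (5)
holds_2 = norm(fg) <= norm(f) * norm(g) + 1e-12                    # third part of (2)
```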


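THEOREM 13. Let $f(x)$ be an a.p. function $\neq 0$. Put

$$\Gamma_n = (N[f \times f' \times \cdots])^2 = (N[f' \times f \times \cdots])^2 \quad (n \text{ factors})$$
$$= (f \times f')^n(1) = (f' \times f)^n(1).$$

Then $\Gamma_n > 0$, $\Gamma_n^2 \leq \Gamma_{n-1}\Gamma_{n+1}$, and $\Gamma_{m+n} \leq \Gamma_m \Gamma_n$.

First we prove that the four expressions above for $\Gamma_n$ are equal. Indeed
the first and third expressions for $\Gamma_n$ are equal, and so are the second and the
fourth, since $(g \times h)' = h' \times g'$. The equality of the third and fourth expressions
follows from

$$(f \times f')^n = f \times (f' \times f)^{n-1} \times f'$$

and from

$$f \times g(1) = g \times f(1) = M_x[f(x)g(x^{-1})].$$

By (5) of Theorem 12, $| f \times g(1) | \leq (Nf)(Ng)$. If we replace here $f$ and $g$ by
$f \times f' \times \cdots$ with $n-1$ and $n+1$ factors respectively, we obtain $\Gamma_n^2 \leq \Gamma_{n-1}\Gamma_{n+1}$.
And if we replace $f$ and $g$ in (2) of Theorem 12 by $f \times f' \times \cdots$ or $f' \times f \times \cdots$
with $m$ and $n$ factors respectively, we obtain $\Gamma_{m+n} \leq \Gamma_m \Gamma_n$. That $\Gamma_n \geq 0$ is
obvious; but the condition $\Gamma_n = 0$ would imply that $\Gamma_{n-1} = 0$ (because
$\Gamma_{n-1}^2 \leq \Gamma_{n-2}\Gamma_n$) provided that $n \geq 3$, so that $\Gamma_2 = 0$. This means that
$N[f \times f'] = 0$, hence $(Nf)^2 = f \times f'(1) = 0$ (that is, $\Gamma_1 = 0$) and $f = 0$,
contrary to our assumption. Thus we have $\Gamma_n > 0$.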


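THEOREM 14. Let $f(x)$ be as before, and define $\Gamma_1, \Gamma_2, \cdots$ as before. Then,
as $n \to \infty$,

$$\frac{\Gamma_{n+1}}{\Gamma_n} \to \gamma, \qquad \frac{\Gamma_n}{\gamma^n} \to \kappa, \qquad \frac{(f \times f')^n(x)}{\gamma^n} \to \phi(x) \ \text{(uniformly)},$$

where $0 < \gamma \leq \Gamma_1$ and $\kappa \geq 1$. Furthermore, $\phi(x)$ is a.p., and $\phi' = \phi$, $\phi \times \phi = \phi$,
$f \times f' \times \phi = \phi \times f \times f' = \gamma\phi$, $\phi(1) = \kappa$.

The formulas of Theorem 13 imply that

$$0 < \frac{\Gamma_2}{\Gamma_1} \leq \frac{\Gamma_3}{\Gamma_2} \leq \cdots \leq \Gamma_1,$$

and therefore $\Gamma_{n+1}/\Gamma_n$ has a limit $\gamma$ as $n \to \infty$, $0 < \gamma \leq \Gamma_1$. Furthermore,
$\Gamma_{n+1}/\Gamma_n \leq \gamma$, that is,

$$\frac{\Gamma_{n+1}}{\gamma^{n+1}} \leq \frac{\Gamma_n}{\gamma^n}, \qquad \frac{\Gamma_n}{\gamma^n} \geq 0,$$

and therefore $\Gamma_n/\gamma^n$ has a limit $\kappa$, $\kappa \geq 0$. Finally we have $\Gamma_{m+n} \leq \Gamma_m \Gamma_n$,
that is,

$$\frac{\Gamma_{m+n}}{\Gamma_{m+n-1}} \cdots \frac{\Gamma_{m+1}}{\Gamma_m} \leq \Gamma_n.$$

The limiting process $m \to \infty$ shows that $\Gamma_n \geq \gamma^n$ and $\Gamma_n/\gamma^n \geq 1$, and then the
limiting process $n \to \infty$ shows that $\kappa \geq 1$.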
By (4) and (2) of Theorem 12, 


Y 


) xs) 


™m 


Y 
x fy") X 4 (f’ X 


y™tn 


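As $m$ and $n \to \infty$ the last expression converges to 0; thus the first expression
converges uniformly to 0, that is, as $n \to \infty$, $(f \times f')^n(x)/\gamma^n$ converges uniformly
to a limiting function $\phi(x)$. As the functions $(f \times f')^n(x)/\gamma^n$ are a.p.,
$\phi(x)$ is also a.p.

The relations

$$\frac{(f \times f')^m}{\gamma^m} \times \frac{(f \times f')^n}{\gamma^n} = \frac{(f \times f')^{m+n}}{\gamma^{m+n}},$$
$$f \times f' \times \frac{(f \times f')^n}{\gamma^n} = \frac{(f \times f')^n}{\gamma^n} \times f \times f' = \gamma \, \frac{(f \times f')^{n+1}}{\gamma^{n+1}},$$
$$\left( (f \times f')^n \right)' = (f \times f')^n$$

show that, when $m$ and $n$ become infinite (the convergences involved all being uniform),
$\phi \times \phi = \phi$, $f \times f' \times \phi = \phi \times f \times f' = \gamma\phi$, and $\phi' = \phi$. Finally,

$$\frac{(f \times f')^n(1)}{\gamma^n} = \frac{\Gamma_n}{\gamma^n} \to \kappa,$$

and therefore $\phi(1) = \kappa$.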
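
The behaviour of the numbers $\Gamma_n$ described in Theorems 13 and 14 can be observed numerically on a finite group, where the mean is the average over the group. The sketch below is an illustration under that assumption; the group, the sample function and all helper names are hypothetical and chosen only for the example.

```python
from itertools import permutations

G = list(permutations(range(3)))          # assumed example group: S3
E = (0, 1, 2)                             # identity element

def mul(a, b):
    return tuple(a[b[i]] for i in range(3))

def inv(a):
    return tuple(a.index(i) for i in range(3))

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def conv(f, g):
    """(f x g)(x) = M_y[ f(x y^-1) g(y) ]."""
    return {x: mean(f[mul(x, inv(y))] * g[y] for y in G) for x in G}

def prime(f):
    """f'(x) = complex conjugate of f(x^-1)."""
    return {x: f[inv(x)].conjugate() for x in G}

f = {x: complex(i + 1, -i) for i, x in enumerate(G)}   # hypothetical f != 0
p = conv(f, prime(f))                                   # p = f x f'

# Gamma_n = (f x f')^n (1) for n = 1, ..., 8.
gammas, power = [], dict(p)
for n in range(1, 9):
    gammas.append(power[E].real)
    power = conv(power, p)

ratios = [gammas[n + 1] / gammas[n] for n in range(len(gammas) - 1)]
positive = all(g > 0 for g in gammas)
monotone = all(r2 >= r1 - 1e-9 for r1, r2 in zip(ratios, ratios[1:]))
bounded = all(r <= gammas[0] + 1e-9 for r in ratios)
log_convex = all(gammas[n] ** 2 <= gammas[n - 1] * gammas[n + 1] * (1 + 1e-9)
                 for n in range(1, len(gammas) - 1))
```

The list `ratios` is nondecreasing and bounded by $\Gamma_1$, so its last entries give a numerical approximation from below to the limit $\gamma$ of Theorem 14.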

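THEOREM 15. Let $f(x)$ be as before. Then there is a (finite or infinite) sequence
of real numbers $\gamma_1, \gamma_2, \cdots$ and a sequence of a.p. functions $\phi_1(x), \phi_2(x), \cdots$,
all $\neq 0$, such that $\gamma_1 > \gamma_2 > \cdots > 0$, $\phi_n' = \phi_n$, $\phi_n \times \phi_n = \phi_n$, $\phi_n(1) \geq 1$,
$\phi_m \times \phi_n = 0$ ($m \neq n$), $f \times f' \times \phi_n = \phi_n \times f \times f' = \gamma_n \phi_n$, and $\gamma_1\phi_1(x) + \gamma_2\phi_2(x) + \cdots$
converges uniformly to $f \times f'(x)$.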

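Apply Theorem 14, and put there $\gamma = \gamma_1$, $\phi(x) = \phi_1(x)$. $\phi_1(1) = \kappa \geq 1$ proves
that $\phi_1(x) \neq 0$. Now put $f^* = f - \phi_1 \times f$. Then $f^*(x)$ is a.p. and

$$f^* \times f^{*\prime} = (f - \phi_1 \times f) \times (f' - f' \times \phi_1)$$
$$= f \times f' - \gamma_1\phi_1 - \gamma_1\phi_1 + \gamma_1 \phi_1 \times \phi_1 = f \times f' - \gamma_1\phi_1.$$

If $f^* = 0$, this shows that $f \times f' = \gamma_1\phi_1$, that is, the Theorem holds for a sequence
consisting of one element. Assume that $f^* \neq 0$.

Then $f \times f' \times \phi_1 = \phi_1 \times f \times f' = \gamma_1\phi_1$ implies that $(f^* \times f^{*\prime})^n = (f \times f' - \gamma_1\phi_1)^n
= (f \times f')^n - \gamma_1^n \phi_1$, and thus

$$\frac{(f^* \times f^{*\prime})^n(x)}{\gamma_1^n} = \frac{(f \times f')^n(x)}{\gamma_1^n} - \phi_1(x) \to 0 \quad \text{as } n \to \infty.$$

Therefore if we form $\gamma = \gamma_2$ and $\phi_2(x)$ of Theorem 14, for which we have

$$\frac{(f^* \times f^{*\prime})^n(x)}{\gamma_2^n} \to \phi_2(x) \neq 0,$$

then it must be the case that $\gamma_1 > \gamma_2$.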

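By repeating this process with $f^{**} = f^* - \phi_2 \times f^*$, $f^{***} = f^{**} - \phi_3 \times f^{**}$, $\cdots$
we finally find a sequence of real numbers $\gamma_1, \gamma_2, \cdots$ and two sequences of
a.p. functions $\phi_1(x), \phi_2(x), \cdots$ and $f(x), f^*(x), \cdots$ with the following properties
(we write $f^{(n)}$ for $f$ with $n$ asterisks, so that $f^{(0)} = f$):

$$\gamma_1 > \gamma_2 > \cdots > 0, \quad \phi_n' = \phi_n, \quad \phi_n \times \phi_n = \phi_n, \quad \phi_n(1) \geq 1,$$
$$f^{(n-1)} \times f^{(n-1)\prime} \times \phi_n = \phi_n \times f^{(n-1)} \times f^{(n-1)\prime} = \gamma_n \phi_n, \quad f^{(n)} = f^{(n-1)} - \phi_n \times f^{(n-1)},$$

these sequences ending when an $f^{(n)}$ becomes $= 0$, otherwise never ending.
These rules again imply the relations $f^{(n)} \times f^{(n)\prime} = f^{(n-1)} \times f^{(n-1)\prime} - \gamma_n\phi_n$.
By adding these relations for all $n = 1, \cdots, p$, we obtain

$$(*) \qquad \gamma_1\phi_1 + \cdots + \gamma_p\phi_p = f \times f' - f^{(p)} \times f^{(p)\prime}.$$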


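We now wish to prove that $\phi_m \times \phi_n = 0$ for $m \neq n$. Application of $'$ shows
that it is sufficient to consider $m > n$, that is, it is sufficient to prove that
$\phi_{n+k+1} \times \phi_n = 0$ for $k = 0, 1, 2, \cdots$. Consider the equation $f^{(n+k)} \times f^{(n+k)\prime} \times \phi_n
= 0$. For $k = 0$, the condition

$$f^{(n)} \times f^{(n)\prime} \times \phi_n = f^{(n-1)} \times f^{(n-1)\prime} \times \phi_n - \gamma_n \phi_n \times \phi_n = \gamma_n\phi_n - \gamma_n\phi_n = 0$$

obtains. If it holds for a certain $k = 0, 1, 2, \cdots$, we have

$$\gamma_{n+k+1} \, \phi_{n+k+1} \times \phi_n = \phi_{n+k+1} \times f^{(n+k)} \times f^{(n+k)\prime} \times \phi_n = 0,$$

and thus $\phi_{n+k+1} \times \phi_n = 0$. This gives

$$f^{(n+k+1)} \times f^{(n+k+1)\prime} \times \phi_n = f^{(n+k)} \times f^{(n+k)\prime} \times \phi_n - \gamma_{n+k+1} \, \phi_{n+k+1} \times \phi_n = 0,$$

that is, our equation holds for $k+1$. Therefore it holds for every $k = 0, 1, 2, \cdots$,
and with it, its consequence $\phi_{n+k+1} \times \phi_n = 0$.
Application of $\phi_n \times \cdots$ or $\cdots \times \phi_n$ to $(*)$ with $p = n-1$ gives

$$\phi_n \times f \times f' = \phi_n \times f^{(n-1)} \times f^{(n-1)\prime} = \gamma_n\phi_n, \qquad f \times f' \times \phi_n = f^{(n-1)} \times f^{(n-1)\prime} \times \phi_n = \gamma_n\phi_n.$$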


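The only thing remaining to be proved is the uniform convergence of
$\gamma_1\phi_1(x) + \cdots + \gamma_n\phi_n(x)$ to $f \times f'(x)$ as $n \to \infty$ or, according to $(*)$, the uniform
convergence of $f^{(n)} \times f^{(n)\prime}(x)$ to 0. Now $(*)$ implies, by (3) and (5) of Theorem 12,
that $\gamma_1\phi_1(1) + \cdots + \gamma_n\phi_n(1) = (Nf)^2 - (Nf^{(n)})^2 \leq (Nf)^2$, and since
$|\phi_n \times \phi_n(x)| \leq \phi_n \times \phi_n(1)$, that is, $|\phi_n(x)| \leq \phi_n(1)$, this implies the uniform
convergence of $\gamma_1\phi_1(x) + \gamma_2\phi_2(x) + \cdots$ to $g(x)$ as $n \to \infty$, where $g(x)$ must be
a.p. Hence $f^{(n)} \times f^{(n)\prime}(x) \to f \times f'(x) - g(x)$ uniformly. Moreover, the above mentioned
convergence implies that $\gamma_n \to 0$ (because $\phi_n(1) \geq 1$). On the other hand,
we have $(N[f \times f'])^2 \leq \gamma (Nf)^2$. If we replace $f$ and $\gamma$ by
$f^{(n)}$ and $\gamma_{n+1}$, we obtain $(N[f^{(n)} \times f^{(n)\prime}])^2 \leq \gamma_{n+1}(Nf^{(n)})^2 \leq \gamma_{n+1}(Nf)^2$. Thus

$$N[f^{(n)} \times f^{(n)\prime}] \to 0,$$

implying that $N[f \times f' - g] = 0$ and $g = f \times f'$.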

9. Having reached the final result of the E. Schmidt-Weyl theory, we now
pass to the approximation theorems. But as we have already mentioned, we
are now giving only their proofs and shall discuss their real meaning in the
next part.




DEFINITION 8. An a.p. function $\phi(x)$ such that $\phi' = \phi$ and $\phi \times \phi = \phi$ is
called a unit. Two units $\phi(x)$ and $\psi(x)$ such that $\phi \times \psi = 0$ are called orthogonal.


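THEOREM 16†. For every a.p. function $f(x)$ and every $\epsilon > 0$ there exists a unit
$\phi(x)$ such that $N[f - \phi \times f] \leq \epsilon$.

If $f = 0$, then $\phi = 0$. Hence we may assume that $f \neq 0$. Then apply Theorem
15. $\psi_n = \phi_1 + \cdots + \phi_n$ is a unit (because $\phi_m \times \phi_n = 0$
for $m \neq n$), and we have

$$(N[f - \psi_n \times f])^2 = (f - \psi_n \times f) \times (f' - f' \times \psi_n)(1) = \left( f \times f' - \sum_{\nu=1}^{n} \gamma_\nu \phi_\nu \right)(1) \to 0 \quad \text{as } n \to \infty.$$

Thus $\phi = \psi_n$, for a sufficiently large $n$, yields the desired result.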


THEOREM 17. For every a.p. function $f(x)$ and every $\epsilon > 0$ there exists an a.p.
function $g(x)$ such that $| f(x) - g \times f(x) | \leq \epsilon$ for every $x$.

Following N. Wiener, consider the “translation function” of $f(x)$,

$$e(x) = \text{l.u.b.}_y \, | f(x^{-1}y) - f(y) |,$$

which was introduced by S. Bochner [2]. As $f(x)$ is a.p., it is easily seen that
$e(x)$ is also a.p. Furthermore, $e(x) \geq 0$, $e(1) = 0$. Now define the function

$$F(u) = \begin{cases} 1 - \dfrac{u}{\epsilon}, & 0 \leq u \leq \epsilon, \\ 0, & u \geq \epsilon. \end{cases}$$

As $F(u)$ is continuous, $\phi(x) = F(e(x))$ is a.p. It is obvious that $\phi(x) \geq 0$,
$\phi(1) = 1$; and if $\text{l.u.b.}_y | f(x^{-1}y) - f(y) | \geq \epsilon$, then $\phi(x) = 0$. Thus $\phi(x) \neq 0$ implies
that $| f(x^{-1}y) - f(y) | < \epsilon$, and therefore, always $| \phi(x)(f(x^{-1}y) - f(y)) | \leq \epsilon\phi(x)$.
Consequently, $| M_x[\phi(x)(f(x^{-1}y) - f(y))] | \leq \epsilon M_x\phi(x)$. Now $M_x[\phi(x)(f(x^{-1}y)
- f(y))] = \phi \times f(y) - f(y) M_x\phi(x)$, and therefore the function

$$g(x) = \frac{\phi(x)}{M_x \phi(x)}$$

meets the requirements.

† Cf. Theorems 28 and 29, where the results of Theorems 16 and 18 will be interpreted and applied.
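
The construction in the proof of Theorem 17 can be tried out on a finite cyclic group, where the mean is the average and the least upper bound is a maximum. The following is only a sketch under those assumptions; the group $Z_{12}$ (written additively, so that $x^{-1}y$ becomes $y - x$), the sample function `f`, the value of `eps` and the helper names are all hypothetical.

```python
from math import cos, sin, pi

n = 12                      # assumed example group: Z_12, written additively
G = range(n)
eps = 0.8                   # the tolerance called epsilon in the text

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def f(x):
    """A hypothetical real-valued function on Z_12."""
    return cos(2 * pi * x / n) + 0.5 * sin(2 * pi * 3 * x / n)

def e(x):
    """Translation function e(x) = max_y |f(x^-1 y) - f(y)|, with x^-1 y = y - x."""
    return max(abs(f((y - x) % n) - f(y)) for y in G)

def F(u):
    """F(u) = 1 - u/eps for 0 <= u <= eps, and 0 for u >= eps."""
    return max(0.0, 1.0 - u / eps)

phi = [F(e(x)) for x in G]
m = mean(phi)               # positive, since phi(0) = 1 and phi >= 0
g = [phi[x] / m for x in G]

def g_conv_f(y):
    """(g x f)(y) = M_x[ g(x) f(x^-1 y) ] (second form of Definition 6)."""
    return mean(g[x] * f((y - x) % n) for x in G)

err = max(abs(f(y) - g_conv_f(y)) for y in G)
```

The quantity `err` is the uniform distance between $f$ and $g \times f$; the argument above bounds it by `eps`.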


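THEOREM 18. For every a.p. function $f(x)$ and every $\epsilon > 0$ there exists an
a.p. function $g(x)$ and a unit $\phi(x)$ such that $| f(x) - \phi \times g \times f(x) | \leq \epsilon$ for every $x$.†

Choose the function $g(x)$ of Theorem 17 corresponding to $f(x)$ and $\epsilon/2$,
and choose the unit $\phi(x)$ of Theorem 16 corresponding to $g(x)$ and
$\epsilon/(2Nf)$. Then $| f(x) - g \times f(x) | \leq \epsilon/2$, $| g \times f(x) - \phi \times g \times f(x) | \leq N[g - \phi \times g] \cdot Nf
\leq \epsilon/2$, and therefore $| f(x) - \phi \times g \times f(x) | \leq \epsilon$.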


III. THEORY OF LINEAR REPRESENTATIONS OF $\mathfrak{G}$


10. We define the representations in the usual manner: 


DEFINITION 9. If to every $a \in \mathfrak{G}$ there corresponds a matrix

$$D(a) = \{ D_{\rho\sigma}(a) \} \qquad (\rho, \sigma = 1, \cdots, s)$$

of degree $s$ such that $D(1) = 1_s$ (the unit matrix of degree $s$), $D(ab) = D(a) \cdot D(b)$,
then we call $D(a)$ a representation of $\mathfrak{G}$. (No continuity is assumed.) Two representations
$D(a)$ and $D'(a)$ are called equivalent if they are of the same degree $s$,
and if a fixed matrix $U = \{U_{\rho\sigma}\}$, $\rho, \sigma = 1, \cdots, s$, exists which transforms one
representation into the other: $U^{-1}D(a)U = D'(a)$. A representation $D(a)$ is
called reducible (completely reducible) if it is equivalent to a representation $D'(a)$
such that $D'_{\rho\sigma}(a) = 0$ identically (in $a$) whenever $\rho \leq t$, $\sigma > t$ ($\rho \leq t$, $\sigma > t$ or
$\rho > t$, $\sigma \leq t$), for a fixed value of $t$, $1 \leq t \leq s-1$. Representations without these
properties are irreducible (completely irreducible).‡


For finite groups © Frobenius and Schur gave a complete theory of all 
representations [21, 22]; for continuous groups © close analogues of their 
results were established by Schur for the rotation group in three dimensions, 
and in much broader generality by Weyl for all compact Lie-groups [30]. 
These results were extended to all compact groups G by Haar [11, pp. 166- 
169] with the help of his notion of “right-invariant” Lebesgue measure in 
groups. We shall push the extension further to all groups G, but in order to 
do this it is natural and necessary to restrict the domain of representations 
of @ by means of 


† Cf. footnote to Theorem 16.
‡ These are the fundamental notions of the Frobenius-Schur theory of group representations
[21, 22, 30].


THEOREM 19. The following conditions on a representation $D(a)$ of $\mathfrak{G}$ are
equivalent to each other:

A. $D(a)$ is equivalent to a unitary‡ representation.

B. All elements $D_{\rho\sigma}(a)$ of $D(a)$ are bounded.

C. All elements $D_{\rho\sigma}(a)$ of $D(a)$ are a.p.


If $D'(a)$ is unitary, then, by the footnote just cited in the case where $\rho = \sigma$,

$$\sum_{\tau=1}^{s} | D'_{\rho\tau}(a) |^2 = 1, \qquad | D'_{\rho\tau}(a) | \leq 1,$$

and all $D'_{\rho\sigma}(a)$ are bounded. Therefore the elements $D_{\rho\sigma}(a)$ of any $D(a)$
equivalent to $D'(a)$ must also be bounded. Thus A implies B.

If all $D_{\rho\sigma}(a)$ are bounded, every sequence $D_{\rho\sigma}(a_n)$, $n = 1, 2, \cdots$, contains
a subsequence which converges for all $\rho, \sigma = 1, \cdots, s$. And then, since
$D(xa_n) = D(x)D(a_n)$ and $D(a_nx) = D(a_n)D(x)$, the representations $D(xa_n)$ and
$D(a_nx)$, and hence all $D_{\rho\sigma}(xa_n)$ and $D_{\rho\sigma}(a_nx)$, converge uniformly. Thus all
$D_{\rho\sigma}(a)$ are a.p., that is, B implies C.

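If all $D_{\rho\sigma}(a)$ are a.p., so are the expressions $\sum_{\tau=1}^{s} D_{\rho\tau}(a)\overline{D_{\sigma\tau}(a)}$, and we
can form

$$A_{\rho\sigma} = M_x\left[ \sum_{\tau=1}^{s} D_{\rho\tau}(x)\overline{D_{\sigma\tau}(x)} \right].$$

Now

$$\overline{A_{\rho\sigma}} = M_x\left[ \sum_{\tau=1}^{s} \overline{D_{\rho\tau}(x)} D_{\sigma\tau}(x) \right] = A_{\sigma\rho},$$

and for every system $\xi_1, \cdots, \xi_s$ which is not identically zero,

$$\sum_{\rho,\sigma=1}^{s} A_{\rho\sigma} \xi_\rho \overline{\xi_\sigma} = M_x\left[ \sum_{\tau=1}^{s} \Big| \sum_{\rho=1}^{s} D_{\rho\tau}(x)\xi_\rho \Big|^2 \right] > 0.$$

Therefore the matrix $A = \{A_{\rho\sigma}\}$ is Hermitian and positive definite. Hence
there exists a matrix $X = \{X_{\rho\sigma}\}$ such that $A = XX^*$. On the other hand,

$$A_{\rho\sigma} = M_x\left[ \sum_{\tau=1}^{s} D_{\rho\tau}(ax)\overline{D_{\sigma\tau}(ax)} \right] = M_x\left[ \sum_{\tau=1}^{s} \sum_{\rho'=1}^{s} D_{\rho\rho'}(a)D_{\rho'\tau}(x) \sum_{\sigma'=1}^{s} \overline{D_{\sigma\sigma'}(a)} \, \overline{D_{\sigma'\tau}(x)} \right]$$
$$= \sum_{\rho',\sigma'=1}^{s} D_{\rho\rho'}(a) A_{\rho'\sigma'} \overline{D_{\sigma\sigma'}(a)},$$

that is, $A = D(a)AD(a)^*$, or $XX^* = D(a)XX^*D(a)^*$, $X^{-1}D(a)XX^*D(a)^*X^{*-1}
= 1$, $(X^{-1}D(a)X)(X^{-1}D(a)X)^* = 1$. In other words, the equivalent representation
$X^{-1}D(a)X = D'(a)$ is unitary. Thus C implies A.

‡ That is, to a representation $D'(a)$ in which all matrices $D'(a) = \{D'_{\rho\sigma}(a)\}$ ($\rho, \sigma = 1, \cdots, s$) are
unitary. A matrix $U = \{U_{\rho\sigma}\}$ is called unitary if its adjoint $U^* = \{\overline{U_{\sigma\rho}}\}$ is reciprocal to it, that is, if
$UU^* = U^*U = 1_s$, or more explicitly,

$$\sum_{\tau=1}^{s} U_{\rho\tau}\overline{U_{\sigma\tau}} = \sum_{\tau=1}^{s} \overline{U_{\tau\rho}}U_{\tau\sigma} = \begin{cases} 1, & \rho = \sigma, \\ 0, & \rho \neq \sigma. \end{cases}$$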
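
The step from C to A can be illustrated numerically in the finite case, where the mean is the average over the group. The sketch below assumes a hypothetical example: a real, non-orthogonal $2 \times 2$ representation of the cyclic group of order 3 (so that the adjoint is the transpose), and a Cholesky factor as one possible choice of $X$ with $A = XX^*$. The matrix `T` and all helper names are assumptions for the example.

```python
from math import sqrt

def matmul(P, Q):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(P):
    return [[P[j][i] for j in range(2)] for i in range(2)]

def inverse(P):
    det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    return [[P[1][1] / det, -P[0][1] / det], [-P[1][0] / det, P[0][0] / det]]

I2 = [[1.0, 0.0], [0.0, 1.0]]
T = [[0.0, -1.0], [1.0, -1.0]]            # T^3 = 1, but T is not orthogonal
D = [I2, T, matmul(T, T)]                 # the representation k -> T^k of Z_3

# A = M_x[ D(x) D(x)* ]; for real matrices the adjoint is the transpose.
A = [[sum(matmul(Dx, transpose(Dx))[i][j] for Dx in D) / len(D)
      for j in range(2)] for i in range(2)]

# One choice of X with A = X X*: the lower triangular Cholesky factor.
x11 = sqrt(A[0][0])
x21 = A[1][0] / x11
x22 = sqrt(A[1][1] - x21 ** 2)
X = [[x11, 0.0], [x21, x22]]

def defect(P):
    """Largest entry of P P* - 1 in absolute value; zero when P is unitary."""
    PPt = matmul(P, transpose(P))
    return max(abs(PPt[i][j] - I2[i][j]) for i in range(2) for j in range(2))

before = max(defect(Dx) for Dx in D)
after = max(defect(matmul(matmul(inverse(X), Dx), X)) for Dx in D)
```

In this example `before` is 1, while `after` vanishes up to rounding, so the transformed matrices $X^{-1}D(a)X$ are orthogonal.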

Our three statements together prove the equivalence of A, B, and C. 

DEFINITION 10. We call normal the representations satisfying one of the 
equivalent conditions of Theorem 19. 


11. The fundamental theorems of the theory of orthogonal representa- 
tions may now be proved in the classical way [21, 22, 30, 32]. 


THEOREM 20. Let $D(a)$ and $E(a)$ be completely irreducible normal representations
of degrees $s$ and $t$ respectively, and let $A$ be a rectangular matrix with
$s$ rows and $t$ columns. If $D(a)A = AE(a)$ for every $a$, then either $A = 0$ or $s = t$
and $\det A \neq 0$, the latter alternative of course implying the equivalence of $D(a)$
and $E(a)$. If $D(a) = E(a)$, then $A = \alpha 1_s$ ($\alpha$ being a complex number).


In all these statements (except the last) $D(a)$ and $E(a)$ may be replaced
by two equivalent representations. Therefore we may assume them to be
unitary. Even then further transformations by unitary matrices $X$ and $Y$
are possible. They carry $A$ into $A' = X^{-1}AY$. Now by such transformations
we can obtain $A' = \{A'_{\rho\sigma}\}$, such that

$$A'_{\rho\sigma} = \begin{cases} a_\rho > 0 & \text{for } \rho = \sigma = 1, \cdots, r, \\ 0 & \text{for all other } \rho \text{ and } \sigma, \end{cases}$$

where $r$ is the rank of $A'$ and $r \leq s$, $r \leq t$. Therefore we may assume that $A$ itself
has this form.

Under these conditions the relation $D(a)A = AE(a)$ implies that $D_{\rho\sigma}(a) = 0$
for $\rho > r$ and $\sigma \leq r$, and that $E_{\rho\sigma}(a) = 0$ for $\rho \leq r$ and $\sigma > r$. Since $A^*$ also has
the form we assumed for $A$, $D(a)^* = D(a)^{-1} = D(a^{-1})$, $E(a)^* = E(a)^{-1} = E(a^{-1})$,
we get, by applying $*$ and replacing $a$ by $a^{-1}$, $A^*D(a) = E(a)A^*$, so that
$D_{\rho\sigma}(a) = 0$ for $\rho \leq r$ and $\sigma > r$, and $E_{\rho\sigma}(a) = 0$ for $\rho > r$ and $\sigma \leq r$. Thus the complete
irreducibility of $D(a)$ requires that $r$ be 0 or $s$, and the complete irreducibility
of $E(a)$ requires that $r$ be 0 or $t$. Hence either $r = 0$, in which case
$A = 0$, or $r = s = t$, in which case $\det A \neq 0$.

If $D(a) = E(a)$, every $A - \alpha 1$ has the same property as $A$. If $\alpha$ is a root of
the characteristic equation of $A$, we have $\det[A - \alpha 1] = 0$, so that our alternative
requires that $A - \alpha 1 = 0$, and $A = \alpha 1$.


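THEOREM 21. Let $D(a)$ and $E(a)$ be completely irreducible normal representations
of degrees $s$ and $t$ respectively. If they are inequivalent, we have

$$D_{\rho\sigma} \times E_{\tau\upsilon}(x) = 0.$$

Considering $D(a)$ alone, we have

$$D_{\rho\sigma} \times D_{\tau\upsilon}(x) = \begin{cases} \dfrac{1}{s} D_{\rho\upsilon}(x) & \text{for } \sigma = \tau, \\ 0 & \text{for } \sigma \neq \tau. \end{cases}$$

Form the (rectangular) matrix

$$A = \{A_{\tau\sigma}\}, \qquad A_{\tau\sigma} = D_{\rho\sigma} \times E_{\tau\upsilon}(x)$$

for a given choice of $\rho$, $\upsilon$, and $x$. Then

$$\sum_{\tau'=1}^{t} E_{\tau\tau'}(a)A_{\tau'\sigma} = M_y[D_{\rho\sigma}(xy^{-1})E_{\tau\upsilon}(ay)] = M_z[D_{\rho\sigma}(z)E_{\tau\upsilon}(az^{-1}x)],$$
$$\sum_{\sigma'=1}^{s} A_{\tau\sigma'}D_{\sigma'\sigma}(a) = M_y[D_{\rho\sigma}(ya)E_{\tau\upsilon}(y^{-1}x)] = M_z[D_{\rho\sigma}(z)E_{\tau\upsilon}(az^{-1}x)]$$

(the variable $y$ being replaced by $z = xy^{-1}$ and $z = ya$ respectively), and therefore

$$E(a)A = AD(a).$$

Thus we can apply Theorem 20. If $D(a)$ and $E(a)$ are inequivalent, it results
that $A = 0$, and if $D(a) = E(a)$, $A = \alpha_{\rho\upsilon}(x)1_s$ ($\alpha_{\rho\upsilon}(x)$ being a complex number).
This implies (1) that $A_{\tau\sigma} = 0$, that is, $D_{\rho\sigma} \times E_{\tau\upsilon}(x) = 0$ if $D(a)$ and $E(a)$ are
inequivalent or if $D(a) = E(a)$ and $\sigma \neq \tau$; and (2) that $A_{\sigma\sigma} = \alpha_{\rho\upsilon}(x)$, which is
independent of $\sigma$, if $D(a) = E(a)$ and $\sigma = \tau$. Hence all the statements of our
Theorem are proved if we show that $\alpha_{\rho\upsilon}(x) = (1/s)D_{\rho\upsilon}(x)$. This follows immediately
from

$$s \, \alpha_{\rho\upsilon}(x) = \sum_{\sigma=1}^{s} D_{\rho\sigma} \times D_{\sigma\upsilon}(x) = M_y\left[ \sum_{\sigma=1}^{s} D_{\rho\sigma}(xy^{-1})D_{\sigma\upsilon}(y) \right]$$
$$= M_y D_{\rho\upsilon}(xy^{-1}y) = M_y D_{\rho\upsilon}(x) = D_{\rho\upsilon}(x).$$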


THEOREM 22. For normal representations reducibility and complete reduci- 
bility are equivalent, so that irreducibility and complete irreducibility are also 


equivalent. 

That complete reducibility implies reducibility is obvious. Assume now 
that D(a) is reducible without being completely reducible. As we can replace 
D(a) by any equivalent representation, we may assume D(a) to be in the form 
described in Definition 9. Then there would be a pair of indices $\rho$ and $\sigma$ such
that $D_{\rho\sigma}(x) = 0$. By Theorem 21, this relation implies that $D_{\sigma\rho} \times D_{\rho\sigma} = \frac{1}{s} D_{\sigma\sigma} = 0$,
in spite of the fact that $D_{\sigma\sigma}(1) = 1$. Thus $D(a)$ must be completely reducible.

12. Our next task is to formulate the connection between the units of 
Definition 8 and representations. This is accomplished by means of 

THEOREM 23. For every unit $\phi(x)$ there exist a number of inequivalent irreducible
unitary representations $D^{(1)}(a), \cdots, D^{(u)}(a)$ of degrees $s_1, \cdots, s_u$
respectively such that

$$\phi(x) = \sum_{w=1}^{u} s_w \sum_{\rho,\sigma=1}^{s_w} a^{(w)}_{\rho\sigma} D^{(w)}_{\rho\sigma}(x).$$

Here every matrix $a^{(w)} = \{a^{(w)}_{\rho\sigma}\}$ is idempotent, that is, $a^{(w)*} = a^{(w)}$ (cf. footnote
on page 465), $(a^{(w)})^2 = a^{(w)}$.† Conversely, every $\phi(x)$ which is formed in this way
(where $D^{(w)}(a)$ and $a^{(w)}$ satisfy our conditions) is a unit.
By a suitable choice of D(a) we can give the matrices a the form 


(w) 
Apo 


=1,---,S., 


0 for all other p and o. 

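Consider the (a.p.) solutions $f(x)$ of the equation $\phi \times f = f$. Assume that
it is possible to find $s$ solutions $g_1, \cdots, g_s$ among them which satisfy the conditions

$$g_\mu \times g_\nu'(1) = M_x[g_\mu(x)\overline{g_\nu(x)}] = \begin{cases} 1 & \text{for } \mu = \nu, \\ 0 & \text{for } \mu \neq \nu. \end{cases}$$

Put

$$\psi(x, y) = \phi(xy^{-1}) - \sum_{\sigma=1}^{s} g_\sigma(x)\overline{g_\sigma(y)}.$$

Then

$$M_{x,y}[\, |\psi(x, y)|^2 \,] = M_{x,y}[\, |\phi(xy^{-1})|^2 \,] - M_{x,y}\Big[ \phi(xy^{-1}) \sum_{\sigma} \overline{g_\sigma(x)} g_\sigma(y) \Big] - M_{x,y}\Big[ \overline{\phi(xy^{-1})} \sum_{\sigma} g_\sigma(x)\overline{g_\sigma(y)} \Big] + M_{x,y}\Big[ \Big| \sum_{\sigma} g_\sigma(x)\overline{g_\sigma(y)} \Big|^2 \Big].$$

By Theorems 9 and 10, the orthogonality properties of $g_\mu$, and the relations
$\phi \times g_\mu = g_\mu$, this turns out to be equal to

$$M_x[\, |\phi(x)|^2 \,] - s - s + s = (N\phi)^2 - s.$$

Therefore we have $(N\phi)^2 - s \geq 0$, so that $s \leq (N\phi)^2$. Thus the possible numbers
$s$ are bounded, and they have a maximal value. Assume that $s$ is this maximal
value and choose $g_1, \cdots, g_s$ accordingly. If a solution $f(x)$ of $\phi \times f = f$ is such
that $f \times g_\mu'(1) = 0$ for all $\mu = 1, \cdots, s$, then necessarily $f = 0$, for otherwise we
could put $g_{s+1} = f/(Nf)$, implying that $Ng_{s+1} = 1$, that is, $g_{s+1} \times g_{s+1}'(1) = 1$, and
$g_\mu \times g_{s+1}'(1) = 0$ as well as $g_{s+1} \times g_\mu'(1) = 0$ for $\mu = 1, \cdots, s$, so that

$$g_\mu \times g_\nu'(1) = \begin{cases} 1 & \text{for } \mu = \nu, \\ 0 & \text{for } \mu \neq \nu \end{cases} \qquad (\mu, \nu = 1, \cdots, s+1),$$

contradicting our assumption that $s$ is maximal.
If we define $f(x)$ to be $\psi(x, a)$, we find by a simple computation that
$\phi \times f = f$ and $f \times g_\mu'(1) = 0$. Therefore $f(x) = 0$, $\psi(x, a) = 0$, and, as $a$ was arbitrary,
$\psi(x, y) = 0$, that is,

$$(\dagger) \qquad \phi(xy^{-1}) = \sum_{\sigma=1}^{s} g_\sigma(x)\overline{g_\sigma(y)}.$$

As $(\dagger)$ implies that

$$f(x) = \phi \times f(x) = M_y[\phi(xy^{-1})f(y)] = \sum_{\sigma=1}^{s} g_\sigma(x) \, M_y[\overline{g_\sigma(y)}f(y)],$$

every solution of $\phi \times f = f$ has the form $\sum_{\sigma=1}^{s} \alpha_\sigma g_\sigma$ ($\alpha_\sigma$ being complex numbers).

† That is, $\overline{a^{(w)}_{\sigma\rho}} = a^{(w)}_{\rho\sigma}$ and $\sum_{\tau=1}^{s_w} a^{(w)}_{\rho\tau} a^{(w)}_{\tau\sigma} = a^{(w)}_{\rho\sigma}$.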


It is obvious that $f(xa)$ is a solution along with $f(x)$, so that $g_\sigma(xa)$ is a solution,
and we can write

$$g_\sigma(xa) = \sum_{\rho=1}^{s} D_{\rho\sigma}(a) g_\rho(x).$$

The orthogonality properties of $g_\rho$ determine the coefficients $D_{\rho\sigma}(a)$ uniquely:

$$D_{\rho\sigma}(a) = M_x[g_\sigma(xa)\overline{g_\rho(x)}] = g_\rho' \times g_\sigma(a).$$

Hence $D(1) = 1_s$, and if we put $D(a) = \{D_{\rho\sigma}(a)\}$, $\rho, \sigma = 1, \cdots, s$, we obviously
have $D(ab) = D(a)D(b)$, that is, $D(a)$ is a representation. As $(\dagger)$ holds
if we replace $x$ and $y$ by $xa$ and $ya$, that is, if we replace $g_\sigma(x)$ and $g_\sigma(y)$ by
$g_\sigma(xa)$ and $g_\sigma(ya)$, the transformations $D(a)$ must be unitary.

All this implies that

$$\phi(x) = \sum_{\sigma=1}^{s} g_\sigma(x)\overline{g_\sigma(1)} = \sum_{\rho,\sigma=1}^{s} D_{\rho\sigma}(x) g_\rho(1)\overline{g_\sigma(1)},$$

so that $\phi(x)$ is a linear aggregate of all $D_{\rho\sigma}(x)$. Now (cf. Theorem 22) $D(a)$
can be transformed into an equivalent $D'(a)$ which consists of a certain number
of irreducible representations $D^{(1)}(a), \cdots, D^{(v)}(a)$ (of degrees $s_1, \cdots, s_v$
respectively, where $s_1 + \cdots + s_v = s$) which succeed each other along the
main diagonal, and zeros in all other places. Therefore $\phi(x)$ is also a linear
aggregate of the elements $D'_{\rho\sigma}(x)$, that is, of the elements $D^{(w)}_{\rho\sigma}(x)$. Now if
some representation $D^{(w)}(x)$ is equivalent to another representation $D^{(w')}(x)$,
the elements $D^{(w)}_{\rho\sigma}(x)$ are linear aggregates of elements of $D^{(w')}(x)$. Therefore,
in expressing $\phi(x)$ as a linear aggregate of all elements $D^{(w)}_{\rho\sigma}(x)$, it is sufficient
to keep only one member of each class of equivalent representations $D^{(w)}(x)$.
Those representations thus kept may be labeled $D^{(1)}(x), \cdots, D^{(u)}(x)$, $u \leq v$.
So we finally have the result that $D^{(1)}(a), \cdots, D^{(u)}(a)$ (of degrees $s_1, \cdots, s_u$
respectively) is a set of inequivalent irreducible unitary representations,
and $\phi(x)$ is a linear aggregate of the elements $D^{(w)}_{\rho\sigma}(x)$, that is, we can write
$\phi(x)$ in the form

$$\phi(x) = \sum_{w=1}^{u} s_w \sum_{\rho,\sigma=1}^{s_w} a^{(w)}_{\rho\sigma} D^{(w)}_{\rho\sigma}(x).$$

If by means of this equation we now determine the meaning of $\phi' = \phi$
and $\phi \times \phi = \phi$, remembering that

$$D^{(w)}(x)^* = D^{(w)}(x)^{-1} = D^{(w)}(x^{-1}), \qquad \overline{D^{(w)}_{\rho\sigma}(x^{-1})} = D^{(w)}_{\sigma\rho}(x),$$

that is, $D^{(w)}_{\rho\sigma}(x)' = D^{(w)}_{\sigma\rho}(x)$, and

$$D^{(w)}_{\rho\sigma} \times D^{(w')}_{\tau\upsilon} = \begin{cases} \dfrac{1}{s_w} D^{(w)}_{\rho\upsilon} & \text{if } w = w' \text{ and } \sigma = \tau, \\ 0 & \text{for all other } w, w', \rho, \sigma, \tau, \upsilon, \end{cases}$$

we obtain exactly the conditions in our Theorem. Furthermore it is clear that
every matrix $a^{(w)} = \{a^{(w)}_{\rho\sigma}\}$, being idempotent, can be transformed into the
form given at the end of our Theorem. And the inverse transformations of the
representations $D^{(w)}(x)$, which carry them into equivalent representations,
bring about just these $a$-transformations.

13. We choose a system of “representants” for the inequivalent irre- 
ducible (normal or orthogonal) representations of G: 


DEFINITION 11. Let $I$ be the set of all irreducible normal representations of
$\mathfrak{G}$. Call each subset $\mathfrak{C}$ of $I$ which consists of all the elements of $I$ equivalent to one
of its elements a class. It is obvious that every element of $I$ belongs to exactly one
class. Call the set of all classes $C$. Each element $\mathfrak{C}$ of $C$ contains unitary representations
(since every normal representation is equivalent to a unitary representation).
Select one unitary representation from each $\mathfrak{C}$ of $C$, call it the representant
of $\mathfrak{C}$, denote it by $D(a; \mathfrak{C})$†, and denote its degree by $s(\mathfrak{C})$.


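THEOREM 24. The (a.p.) functions $D_{\rho\sigma}(x; \mathfrak{C})$, $\mathfrak{C}$ in $C$, $\rho$ and $\sigma = 1, \cdots, s(\mathfrak{C})$,
have the property that

$$D_{\rho\sigma}(\mathfrak{C}) \times D_{\tau\upsilon}(\mathfrak{D})(x) = \begin{cases} \dfrac{1}{s(\mathfrak{C})} D_{\rho\upsilon}(x; \mathfrak{C}) & \text{for } \mathfrak{C} = \mathfrak{D} \text{ and } \sigma = \tau, \\ 0 & \text{for all other } \mathfrak{C}, \mathfrak{D}, \rho, \sigma, \tau, \upsilon. \end{cases}$$

This implies that

$$M_x[D_{\rho\sigma}(x; \mathfrak{C})\overline{D_{\tau\upsilon}(x; \mathfrak{D})}] = \begin{cases} \dfrac{1}{s(\mathfrak{C})} & \text{for } \mathfrak{C} = \mathfrak{D}, \ \rho = \tau, \ \sigma = \upsilon, \\ 0 & \text{for all other } \mathfrak{C}, \mathfrak{D}, \rho, \sigma, \tau, \upsilon. \end{cases}$$

The first formula has been proved in Theorem 21. The second formula
follows from it if we put the variable equal to 1 and remember that
$D_{\rho\sigma}(\mathfrak{D})' = D_{\sigma\rho}(\mathfrak{D})$, $D(1; \mathfrak{D}) = 1$.

Thus the functions $s(\mathfrak{C})^{1/2} D_{\rho\sigma}(x; \mathfrak{C})$ form a “normalized orthogonal” system.
This is the basis for the formulation of the usual expansion theorems.
The key theorems of the theory will be proved as Theorems 28 and 29.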
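
The second formula of Theorem 24 can be verified directly for a small finite group, where the mean is the average over the group elements. The sketch below assumes, as an example not taken from the text, the two-dimensional irreducible orthogonal representation of the symmetric group on three letters, realised by the rotations and reflections of an equilateral triangle; since the matrices are real, the complex conjugation has no effect.

```python
from math import cos, sin, pi

def matmul(P, Q):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

c, s = cos(2 * pi / 3), sin(2 * pi / 3)
I2 = [[1.0, 0.0], [0.0, 1.0]]
R = [[c, -s], [s, c]]                     # rotation by 120 degrees
S = [[1.0, 0.0], [0.0, -1.0]]             # a reflection

rotations = [I2, R, matmul(R, R)]
D = rotations + [matmul(S, P) for P in rotations]   # the six matrices D(x)
degree = 2

def pairing(rho, sigma, tau, ups):
    """M_x[ D_{rho sigma}(x) * conj D_{tau upsilon}(x) ] for real entries."""
    return sum(Dx[rho][sigma] * Dx[tau][ups] for Dx in D) / len(D)

worst = 0.0
for rho in range(2):
    for sigma in range(2):
        for tau in range(2):
            for ups in range(2):
                expected = 1.0 / degree if (rho == tau and sigma == ups) else 0.0
                worst = max(worst, abs(pairing(rho, sigma, tau, ups) - expected))
```

The value `worst` measures the largest deviation from the orthogonality relations with $s(\mathfrak{C}) = 2$; it vanishes up to rounding.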


† This does not imply an essential use of the “axiom of choice” because it would in most cases be
possible to characterize a $D(a; \mathfrak{C})$ in $\mathfrak{C}$ in a unique way. To abbreviate we shall also use the notation
$D(\mathfrak{C})$ omitting the argument $a$.


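DEFINITION 12. If $f(x)$ is an a.p. function, the complex numbers

$$\alpha_{\rho\sigma}(\mathfrak{C}) = f \times D_{\rho\sigma}(1; \mathfrak{C})' = M_x[f(x)\overline{D_{\rho\sigma}(x; \mathfrak{C})}]$$

are called its expansion coefficients. The matrices $\alpha(\mathfrak{C}) = \{\alpha_{\rho\sigma}(\mathfrak{C})\}$ are called
the expansion matrices.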


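THEOREM 25. If $f$ and $g$ have the expansion matrices $\alpha(\mathfrak{C})$ and $\beta(\mathfrak{C})$ ($\mathfrak{C}$
running over $C$), $f + g$, $\theta f$ ($\theta$ any complex number), $f'$ and $f \times g$ have the expansion
matrices $\alpha(\mathfrak{C}) + \beta(\mathfrak{C})$, $\theta\alpha(\mathfrak{C})$, $\alpha(\mathfrak{C})^*$, and $\alpha(\mathfrak{C})\beta(\mathfrak{C})$ respectively. A unit is
characterized by the fact that all its expansion matrices are idempotent, so that
only a finite number of them can be $\neq 0$.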


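The statements concerning $f + g$ and $\theta f$ are obvious; as to $f'$ it is sufficient
to remark that

$$M_x[f'(x)\overline{D_{\rho\sigma}(x; \mathfrak{C})}] = M_x[\overline{f(x^{-1})} \, \overline{D_{\rho\sigma}(x; \mathfrak{C})}] = \overline{M_x[f(x)D_{\rho\sigma}(x^{-1}; \mathfrak{C})]}$$
$$= \overline{M_x[f(x)\overline{D_{\sigma\rho}(x; \mathfrak{C})}]} = \overline{\alpha_{\sigma\rho}(\mathfrak{C})}.$$

The following computation proves our statement with regard to $f \times g$ (cf.
Theorems 7 and 8):

$$M_x[f \times g(x)\overline{D_{\rho\sigma}(x; \mathfrak{C})}] = M_x[M_y[f(xy^{-1})g(y)] \, \overline{D_{\rho\sigma}(x; \mathfrak{C})}]$$
$$= M_y[M_x[f(x)g(y)\overline{D_{\rho\sigma}(xy; \mathfrak{C})}]]$$
$$= \sum_{\tau=1}^{s(\mathfrak{C})} M_x[f(x)\overline{D_{\rho\tau}(x; \mathfrak{C})}] \, M_y[g(y)\overline{D_{\tau\sigma}(y; \mathfrak{C})}]$$
$$= \sum_{\tau=1}^{s(\mathfrak{C})} \alpha_{\rho\tau}(\mathfrak{C})\beta_{\tau\sigma}(\mathfrak{C}).$$

This discussion shows that the idempotence of all expansion matrices $\alpha(\mathfrak{C})$
is characteristic of units $\phi$, but it follows from Theorem 21 that all those
matrices which have a $D(a; \mathfrak{C})$ not identical to a $D^{(w)}(a)$ for some $w = 1, \cdots, u$
must vanish, so that only a finite number are different from zero. (We here
use the fact that if two functions have the same expansion matrices, they
coincide; that is, if all expansion matrices of a function $f$ vanish, then $f(x) = 0$.
This follows from Theorem 28 (the proof of which does not depend upon
Theorem 25) by putting $f(x) = g(x)$.)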


TuEorEM 26. If f has the expansion matrices &(€) = {&,.(€)} and if 
Gi, ---, ©, are elements of C, then of all linear aggregates g of the elements 
w=1,---, 2; p,0=1,---, S(C.), which can be written in the form 




n 
g(x) = x; Cu) 
wml p,oml 
that one which minimizes the expression (N[f—g])? is characterized by the 
property that Cs) = The value in question is 


(Nf)? — | |? 


p,o=ml 
The proof is contained in the well known computation 


(N[f — g])? = M.[| f(x) — g(x) |}? 


= M,|| f(x) |*] — ML — + g(x) |?) 


p,o=ml 


s(Cx) 
+ s(€.)s(Cx) Mz [Dyo( x; C..)Drv(x; | 


@,X=1 
n 


w=] 


w=] p,o=ml p,o=ml 


wml 


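THEOREM 27. (Bessel's inequality.) If $f$ has the expansion matrices $\alpha(\mathfrak{C})
= \{\alpha_{\rho\sigma}(\mathfrak{C})\}$, the number of those $\mathfrak{C}$ for which $\alpha(\mathfrak{C}) \neq 0$ is at most countably infinite,
that is, it is possible to arrange them in a finite or infinite sequence
$\mathfrak{C}_1, \mathfrak{C}_2, \cdots$. Then we have

$$\sum_{w} s(\mathfrak{C}_w) \sum_{\rho,\sigma=1}^{s(\mathfrak{C}_w)} | \alpha_{\rho\sigma}(\mathfrak{C}_w) |^2 \leq (Nf)^2.$$

Since, for all other $\mathfrak{C}$'s, $\alpha_{\rho\sigma}(\mathfrak{C}) = 0$, we can instead write

$$(Nf)^2 \geq \sum_{\mathfrak{C}} s(\mathfrak{C}) \sum_{\rho,\sigma=1}^{s(\mathfrak{C})} | \alpha_{\rho\sigma}(\mathfrak{C}) |^2.$$

The number of $\mathfrak{C}$'s for which

$$s(\mathfrak{C}) \sum_{\rho,\sigma=1}^{s(\mathfrak{C})} | \alpha_{\rho\sigma}(\mathfrak{C}) |^2 \geq \epsilon$$

is, because of the last statement of Theorem 26, certainly finite and $\leq (Nf)^2/\epsilon$.
Putting $\epsilon = 1, \frac{1}{2}, \frac{1}{3}, \cdots$ successively, we see that, with at most countably
infinitely many exceptional $\mathfrak{C}$'s, it is always the case that

$$s(\mathfrak{C}) \sum_{\rho,\sigma=1}^{s(\mathfrak{C})} | \alpha_{\rho\sigma}(\mathfrak{C}) |^2 = 0,$$

that is, $\alpha_{\rho\sigma}(\mathfrak{C}) = 0$. This proves the first statement of our Theorem.
For every $n$ the last statement of Theorem 26 shows that

$$\sum_{w=1}^{n} s(\mathfrak{C}_w) \sum_{\rho,\sigma=1}^{s(\mathfrak{C}_w)} | \alpha_{\rho\sigma}(\mathfrak{C}_w) |^2 \leq (Nf)^2.$$

Hence the sum

$$\sum_{w} s(\mathfrak{C}_w) \sum_{\rho,\sigma=1}^{s(\mathfrak{C}_w)} | \alpha_{\rho\sigma}(\mathfrak{C}_w) |^2$$

is convergent (if the number of terms is infinite) and $\leq (Nf)^2$.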
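
For a finite cyclic group the irreducible unitary representations are the characters, all of degree 1, and the expansion coefficients of Definition 12 reduce to finite Fourier coefficients. The sketch below illustrates Bessel's inequality in that special case, with the mean taken as the average over the group; the group $Z_8$, the sample function and the helper names are assumptions made for the example.

```python
from cmath import exp, pi

n = 8                       # assumed example group: Z_8, written additively
G = range(n)

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def chi(k, x):
    """The k-th character of Z_n, a unitary representation of degree 1."""
    return exp(2j * pi * k * x / n)

# A hypothetical complex-valued function on Z_8.
f = [complex(x * x % 5, x - 3) for x in G]

# Expansion coefficients alpha_k = M_x[ f(x) * conj chi_k(x) ].
alpha = [mean(f[x] * chi(k, x).conjugate() for x in G) for k in G]

norm_sq = mean(abs(f[x]) ** 2 for x in G)         # (Nf)^2

# Partial sums of |alpha_k|^2; each is bounded by (Nf)^2.
partial, total = [], 0.0
for k in G:
    total += abs(alpha[k]) ** 2
    partial.append(total)

bessel_ok = all(p <= norm_sq + 1e-9 for p in partial)
parseval_gap = abs(partial[-1] - norm_sq)
```

Every partial sum stays below $(Nf)^2$, and once all $n$ characters are included the sum reaches $(Nf)^2$, which is the finite case of the equality asserted by Parseval's equation.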


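THEOREM 28. (Parseval's equation.) If $f$ and $g$ have the expansion matrices
$\alpha(\mathfrak{C}) = \{\alpha_{\rho\sigma}(\mathfrak{C})\}$ and $\beta(\mathfrak{C}) = \{\beta_{\rho\sigma}(\mathfrak{C})\}$, then

$$M_x[f(x)\overline{g(x)}] = \sum_{\mathfrak{C}} s(\mathfrak{C}) \sum_{\rho,\sigma=1}^{s(\mathfrak{C})} \alpha_{\rho\sigma}(\mathfrak{C})\overline{\beta_{\rho\sigma}(\mathfrak{C})},$$

where the series $\sum_{\mathfrak{C}}$ contains at most countably infinitely many terms $\neq 0$,
and is absolutely convergent (if infinite at all).

If this is proved for $f = g$, we obtain the real part of the statement by replacing
our $f$ by $(f+g)/2$ and $(f-g)/2$ and subtracting. Replacing $f$ and $g$
by $if$ and $g$ gives the imaginary part and completes the proof. Hence we may
assume that $f = g$, that is, we must show that

$$(Nf)^2 - \sum_{\mathfrak{C}} s(\mathfrak{C}) \sum_{\rho,\sigma=1}^{s(\mathfrak{C})} | \alpha_{\rho\sigma}(\mathfrak{C}) |^2 = 0.$$

If we take all finite subsets of $C$ for the $\mathfrak{C}_1, \cdots, \mathfrak{C}_n$ in Theorem 26, we
see that the left side of the above equation is the greatest lower bound of all
$(N[f - g])^2$ if $g$ is any linear aggregate of any finite number of elements
$D_{\rho\sigma}(\mathfrak{C})$. Hence we need to show merely that it can be made $\leq \epsilon^2$ for any $\epsilon > 0$.
This is accomplished by choosing the unit $\phi(x)$ according to Theorem 16, because
$\phi \times f$ is a linear aggregate of a finite number of elements $D_{\rho\sigma}(\mathfrak{C})$. By
Theorem 23 we need to prove this only for the elements $D_{\rho\sigma}(\mathfrak{D}) \times f$ and it
follows that

$$D_{\rho\sigma}(\mathfrak{D}) \times f(x) = M_y[D_{\rho\sigma}(xy^{-1}; \mathfrak{D}) f(y)]$$
$$= \sum_{\lambda=1}^{s(\mathfrak{D})} M_y[D_{\rho\lambda}(x; \mathfrak{D}) D_{\lambda\sigma}(y^{-1}; \mathfrak{D}) f(y)]$$
$$= \sum_{\lambda=1}^{s(\mathfrak{D})} D_{\rho\lambda}(x; \mathfrak{D}) \, M_y[D_{\lambda\sigma}(y^{-1}; \mathfrak{D}) f(y)].$$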


THEOREM 29. (Approximation Theorem.) For every a.p. function $f(x)$ and every $\epsilon > 0$ there exists a linear aggregate $h(x)$ of a finite number of elements $D_{\rho\sigma}(x; \mathfrak{C})$ (which can be limited to such elements for which the expansion matrix $\alpha(\mathfrak{C})$ of $f$ is $\neq 0$) such that $|f(x) - h(x)| < \epsilon$ for every $x$.


By Theorem 18 we may put h=¢ Xf Xg, so that we need to prove merely 
that ¢ Xf Xg is a linear aggregate of the desired kind. By Theorem 23 we may 
consider D,,(D) XfXg. The last formula of the preceding proof shows that 
this is a linear aggregate of a finite number of elements D,,(D). This formula 
gives for the coefficient of D,(D) in D,.(D) XfXg (replace its f by fXg) 


D)f X g(y)] = X g(y)] 
= M,/f X g(y)Daly;D)]. 


Thus it is the expansion coefficient of D,,(D) in fXg, and this is equal to 
zero if the expansion matrix of D,(D) in f is zero (cf. the statement of 
Theorem 25 concerning f Xg). 


THEOREM 30. Each a.p. function is the limit of a uniformly convergent sequence of functions each of which is a linear aggregate of a finite number of elements $D_{\rho\sigma}(\mathfrak{C})$, and conversely.


The statement follows from Theorem 29 by putting $\epsilon = 1, \tfrac{1}{2}, \tfrac{1}{3}, \dots$ in succession. The converse statement is a consequence of the a.p. character of all elements $D_{\rho\sigma}(\mathfrak{C})$.


IV. ALMOST PERIODICITY AND CLOSED FAMILIES OF FUNCTIONS 


14. Parts I-III give a fairly complete theory of a.p. functions in an arbi- 
trary group G, absolutely free from the customary restriction of continuity. 
We now introduce restrictions of this type, but in a more general manner, by 
considering certain families of functions. 




DEFINITION 13. A set $S$ of functions $f(x)$ (defined in $G$ with complex numbers as values) is called a closed family (cl.f.) if it has the following properties:
(1) If $f(x)$ is in $S$, every $f(xa)$ is in $S$.

(2) If $f(x)$ is in $S$, every $f(ax)$ is in $S$.

(3) If $f(x)$ is in $S$, every $\alpha f(x)$ is in $S$.

(4) If $f(x)$ and $g(x)$ are in $S$, $f(x) + g(x)$ is in $S$.

(5) If $f_1(x), f_2(x), \dots$ are in $S$ and if $f_n(x)$ converges uniformly to $f(x)$ as $n \to \infty$, then $f(x)$ is in $S$.


THEOREM 31. If $S$ is a cl.f. and contains either $f$ or $g$, then it contains $f \times g$; if $D(a) = \{D_{\rho\sigma}(a)\}$ is an irreducible normal representation, $S$ contains every $D_{\rho\sigma}$ if it contains one $D_{\rho\sigma}$; if the system of representative irreducible normal representations $D(a; \mathfrak{C})$ is given (cf. Definition 11), and if $S$ contains $f$, then $S$ contains all elements $D_{\rho\sigma}(\mathfrak{C})$ of every $\mathfrak{C}$ whose expansion matrix $\alpha(\mathfrak{C})$ (cf. Definition 12) is $\neq 0$.

Therefore Theorems 28 (Parseval’s equation), 29 (Approximation 
Theorem), and 30 remain true if we restrict ourselves throughout to functions 
in S. 

If f belongs to S, fXg belongs to S by the Remark following Definition 6. 
The case where g belongs to S can be reduced to the case where f belongs to S 
by replacing ab by ba in G. If a D,, belongs to S, every D,-,. belongs to it, 
since, by Theorem 21, X Doo X Doo = Finally, 


= © |Dro(x; ©) = > ©) ©), 


t=1 


that is, 


f X = YX 


1 
Dyo(G) x f x 


Hence if &(€) ~0 for a given G, that is, if any &,-,.(€) 0, and if f belongs to 
S, then fXD,-.(€), Xf XD.-.(©), and D,.(©) in turn belong to S. 


If we keep these facts in mind we see that the proofs of Theorems 28, 29, 
and 30 still hold in S. 




1934] ALMOST PERIODIC FUNCTIONS IN A GROUP. I 477 


DEFINITION 14. If a topology $T$ is given in $G$† we denote the set of all $T$-continuous functions by $[T]$.


THEOREM 32. If a topology $T$ is given in $G$† in which $ab$ is continuous in $a$ for a fixed $b$, and in $b$ for a fixed $a$, then $[T]$ is a cl.f.

The statement is obvious. 

The a.p. functions of a cl.f. $S$ are determined (in the manner described in Theorem 30; cf. the last statement of Theorem 31) by the elements $D_{\rho\sigma}(\mathfrak{C})$ belonging to it. This greatly facilitates the determination of all a.p. functions of a given cl.f. $S$.

15. We shall discuss some examples in detail. 

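EXAMPLE 1. Let $G = G_{\mathrm{rat}}$ be the set of all rational numbers with addition as the rule of composition. As the group is Abelian, all irreducible representations are of degree 1, $D(a; \mathfrak{C}) = \{D_{\rho\sigma}(a; \mathfrak{C})\}$, $\rho, \sigma = 1$, so that we have a single element $D_{11}(a; \mathfrak{C}) = \varphi(a; \mathfrak{C})$. The fact that this is a unitary representation is expressed by the relation

\[ (*) \qquad \varphi(a)\varphi(b) = \varphi(a+b), \quad |\varphi(a)| = 1. \]

Every rational number can be written in the form $a = m/n!$ ($m = 0, \pm 1, \pm 2, \dots$; $n = 1, 2, \dots$). Now put

\[ \varphi\Big(\frac{1}{n!}\Big) = e^{2\pi i \lambda_n}, \quad 0 \leq \lambda_n < 1. \]

Then

\[ (**) \qquad \varphi(a) = \varphi\Big(\frac{m}{n!}\Big) = e^{2\pi i m \lambda_n}. \]

Since

\[ \frac{n+1}{(n+1)!} = \frac{1}{n!}, \]

it follows that

\[ ({*}{*}{*}) \qquad (n+1)\lambda_{n+1} \equiv \lambda_n \pmod{1} \]

and, on the other hand, it is clear that (***) makes (**) a definition prescribing a unique value for $\varphi(a)$ which satisfies (*). So the general solution is (**), with the further condition (***). An alternative way of writing (***) is

\[ \lambda_{n+1} = \frac{\lambda_n + p_n}{n+1}, \quad p_n = 0, 1, \dots, n, \quad n = 1, 2, \dots. \]

† See first footnote on page 447.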
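
The construction of Example 1 can be tried out with exact rational arithmetic. The following is a minimal sketch, assuming a rational starting value lambda_1 (in general it may be any real number in [0, 1)) and a finite list of integers p_n with 0 <= p_n <= n; the function names `lambdas`, `phase` and `phi` are hypothetical. It builds lambda_{n+1} = (lambda_n + p_n)/(n+1) and evaluates the character on rationals m/n!.

```python
from fractions import Fraction
from math import factorial
import cmath

def lambdas(lambda1, p):
    """Return [lambda_1, lambda_2, ...] from lambda_{n+1} = (lambda_n + p_n)/(n+1)."""
    lam = [Fraction(lambda1)]
    for n, p_n in enumerate(p, start=1):
        if not 0 <= p_n <= n:
            raise ValueError("p_n must lie in 0..n")
        lam.append((lam[-1] + p_n) / (n + 1))
    return lam

def phase(a, lam):
    """Exact value of m * lambda_n mod 1 for a = m/n! (smallest usable n)."""
    a = Fraction(a)
    for n in range(1, len(lam) + 1):
        m = a * factorial(n)
        if m.denominator == 1:
            return (m * lam[n - 1]) % 1
    raise ValueError("denominator needs more lambda_n than were supplied")

def phi(a, lam):
    """phi(a) = exp(2*pi*i * m * lambda_n)."""
    return cmath.exp(2j * cmath.pi * float(phase(a, lam)))

# Assumed sample data: lambda_1 = 1/3 and p_1..p_5 = 1, 0, 2, 3, 1.
lam = lambdas(Fraction(1, 3), [1, 0, 2, 3, 1])
a, b = Fraction(5, 12), Fraction(7, 30)
print(phase(a, lam), phase(b, lam), phase(a + b, lam))
```

Because the phases are kept as fractions, the relation phi(a)phi(b) = phi(a+b) can be checked exactly as an identity of phases mod 1 rather than only up to rounding.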


EXAMPLE 2. Take the same $G = G_{\mathrm{rat}}$, but take its normal topology $T = T_0$ (distance $|a-b|$) and consider $S = [T_0]$. The question then is, for which $\lambda_n$'s does the $\varphi(a)$ of (**) belong to $[T_0]$? That is, when is it $T_0$-continuous? It is obvious that this means that $n!\lambda_n$ is bounded, and as (***) implies that $n!\lambda_n = \lambda_1 + 1!\,p_1 + \dots + (n-1)!\,p_{n-1}$, it means that only a finite number of the $p_n$'s are $\neq 0$. Thus $n!\lambda_n$ is ultimately constant, say $\Lambda$, and we have

\[ (\S) \qquad \varphi(a) = e^{2\pi i \Lambda a} \quad (\Lambda \text{ real}). \dagger \]


EXAMPLE 3. Take the same $G = G_{\mathrm{rat}}$, but take its $p$-adic topology $T = T_p$ ($p = 2, 3, 5, \dots$ a prime number; distance is then $2^{N_0}$, where $N_0$ is the minimal exponent $N = 0, \pm 1, \pm 2, \dots$ for which the least denominator of $p^N(a-b)$ is not divisible by $p$) and consider $S = [T_p]$. The question is, for which $\lambda_n$'s is the $\varphi(a)$ of (**) $T_p$-continuous? In $T_p$, $p^\nu/n! \to 0$ as $\nu \to \infty$ ($n = 1, 2, \dots$, but fixed), so that $\exp(2\pi i \lambda_n p^\nu) \to 1$, $\lambda_n p^\nu \to 0 \pmod{1}$, which implies, of course, that there is a $\nu = \nu_n$ for which $\lambda_n p^\nu$ is an integer. This can be expressed in the following manner: there is a $\nu$ for which $\lambda_1 p^\nu$ is an integer, and $p_n$ in (***) is divisible by the greatest divisor of $n+1$ which is prime to $p$. On the other hand, it is not difficult to see that this condition is sufficient.

EXAMPLE 4. Let $G = G_{\mathrm{real}}$ be the set of all real numbers with addition as the rule of composition. Again we first determine all a.p. functions, that is, all irreducible unitary representations. This again means solving (*), but now with $a$ and $b$ running over all real numbers. Equation (*) can be solved by the following procedure:

Choose a rational linear basis of the set of real numbers, that is, a set $B$ such that for every real number $a$ the equation $a = \alpha_1\xi_1 + \dots + \alpha_n\xi_n$ ($n = 1, 2, \dots$; $\alpha_1, \dots, \alpha_n$ rational numbers, all $\neq 0$; $\xi_1, \dots, \xi_n$ different elements of $B$) has exactly one solution [12, pp. 459-462]. For every $\xi$ of $B$ we can define the quantity

\[ \Gamma_a(\xi) = \begin{cases} \alpha_m & \text{if } \xi = \text{some } \xi_m, \\ 0 & \text{if } \xi \neq \text{each } \xi_m; \end{cases} \]

then $\Gamma_a(\xi)$ is always rational, we have

\[ a = \sum_{\xi \text{ in } B} \Gamma_a(\xi)\,\xi , \]

where only a finite number of terms are $\neq 0$, and thus $\Gamma_{a+b}(\xi) = \Gamma_a(\xi) + \Gamma_b(\xi)$. From this it follows at once that every solution $\varphi(a)$ of (*) for real $a$'s is of the form

† Thus there exist discontinuous a.p. functions of a rational variable. This fact was proved by Ursell [28, Second Note].




\[ (\dagger) \qquad \varphi(a) = \prod_{\xi \text{ in } B} \varphi_\xi(\Gamma_a(\xi)), \]

where each $\varphi_\xi(c)$ is a solution of (*) for rational $c$'s, and thus only a finite number of factors are $\neq 1$. Conversely, it is obvious that every $\varphi(a)$ in (†) is a solution. Therefore the general solution is given by (†) if, for every $\xi$ of $B$, we choose a $\varphi_\xi(c)$ from (**) and (***) with $\lambda_n = \lambda_{\xi,n}$ and $p_n = p_{\xi,n}$ dependent on $\xi$.

EXAMPLE 5. Take the same $G = G_{\mathrm{real}}$, but consider the set of all Lebesgue-measurable functions, S=S,,, which is obviously a cl.f. The question is, which functions $\varphi(a)$ of (†) are Lebesgue-measurable? As they are solutions of $\varphi(a)\varphi(b) = \varphi(a+b)$ and $|\varphi(a)| = 1$, we can infer from their measurability that they must be of the form

\[ (\S) \qquad \varphi(a) = e^{2\pi i \Lambda a} \quad (\Lambda \text{ real}). \]


EXAMPLE 6. Take the same $G = G_{\mathrm{real}}$, but take its normal topology $T = T_0$ (distance $|a-b|$) and consider $S = [T_0]$. The question is, which functions $\varphi(a)$ of (†) are $T_0$-continuous? As every $T_0$-continuous function is measurable, all such functions must be of the form (§); and as all functions (§) are continuous, this again gives the general solution.||

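EXAMPLE 7. Take the same $G = G_{\mathrm{real}}$, but in it take a new topology $T = T(\Lambda_1, \dots, \Lambda_k)$, where the only relation $n_1\Lambda_1 + \dots + n_k\Lambda_k = 0$, with $n_1, \dots, n_k = 0, \pm 1, \pm 2, \dots$, shall be $n_1 = \dots = n_k = 0$; distance is defined by¶

\[ \big[\, |e^{2\pi i \Lambda_1 a} - e^{2\pi i \Lambda_1 b}|^2 + \dots + |e^{2\pi i \Lambda_k a} - e^{2\pi i \Lambda_k b}|^2 \,\big]^{1/2} = 2\big[\sin^2 \pi\Lambda_1(a-b) + \dots + \sin^2 \pi\Lambda_k(a-b)\big]^{1/2}. \]

Hence the condition $a_n \to a$ in $T(\Lambda_1, \dots, \Lambda_k)$ as $n \to \infty$ means that $a_n \to a$ with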


‡ This is analogous to a result of Fréchet [9] who discussed $f(a) + f(b) = f(a+b)$. Cf. also Sierpinski [23] and Banach [1]. The simplest way to prove our statement is this:

Put y.(a)= Then y,(a) is continuous in a and satisfies ¥.(a)¢(b) =y.(a+5). If we had 
¥<(a)=0 for every ¢, then, as (0/de)y.(a) is equal to ¢(a+e) except over a set of measure zero, it 
would lead to a function ¢(a+«)=0, except over a set of measure zero, which contradicts the con- 
dition |¢(a+e)| = 1. Thus we can find ¢9 and do such that ¥-,(a0) #0, and then our equation shows 
that that is, is continuous. Then is differentiable, so that (by our 
last equation) is also. Now we differentiate ¢(a)¢(b) =¢(a+6) and get ¢’(a)¢(b) =¢’(a+5), that 
is, ¢’(b)=6(b) when a=0. This means that ¢(2)=ae*, and our original conditions make a=1, 
d real. 

|| Thus there exist discontinuous a.p. functions of a real variable, but they are all non-measurable. These facts have also been proved by Ursell [28, First Note].

¶ For $k = 1$ this is not only a new topology in $G_{\mathrm{real}}$, but this also implies an identification of elements congruent mod $1/\Lambda_1$. After this identification it is the normal topology. For $k > 1$ it implies no identifications, but it is a new topology.




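respect to mod $1/\Lambda_1$, ..., and mod $1/\Lambda_k$ simultaneously. Therefore $G_{\mathrm{real}}$ is compact when metrically completed in this topology and every uniformly $T(\Lambda_1, \dots, \Lambda_k)$-continuous function is a.p. (cf. Theorem 36). The question is, which functions $\varphi(a)$ of (†) are uniformly $T(\Lambda_1, \dots, \Lambda_k)$-continuous? As $T(\Lambda_1, \dots, \Lambda_k)$-continuity implies $T_0$-continuity, they must have the form (§), that is, $\varphi(a) = e^{2\pi i \Lambda a}$. Now the condition $a_n \to a$ with respect to mod $1/\Lambda_1$, ..., and mod $1/\Lambda_k$ should imply that $\varphi(a_n) \to \varphi(a)$, so that $e^{2\pi i \Lambda a_n} \to e^{2\pi i \Lambda a}$. This is the case if and only if $\Lambda = n_1\Lambda_1 + \dots + n_k\Lambda_k$ ($n_1, \dots, n_k = 0, \pm 1, \pm 2, \dots$).*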
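
The distance used in Example 7 can be explored numerically. The sketch below is an illustration under stated assumptions: it takes the distance to be the Euclidean distance between the points (exp(2*pi*i*Lambda_j*a)) on a k-dimensional torus, and compares it with the equivalent sine form; the function names and the sample values of Lambda_j are chosen here for the example.

```python
import cmath
import math

def torus_distance(a, b, Lambdas):
    """[ sum_j |exp(2*pi*i*L_j*a) - exp(2*pi*i*L_j*b)|^2 ]^(1/2)."""
    return math.sqrt(sum(
        abs(cmath.exp(2j * math.pi * L * a) - cmath.exp(2j * math.pi * L * b)) ** 2
        for L in Lambdas))

def sine_form(a, b, Lambdas):
    """2 * [ sum_j sin^2(pi*L_j*(a-b)) ]^(1/2), the same quantity rewritten."""
    return 2 * math.sqrt(sum(math.sin(math.pi * L * (a - b)) ** 2 for L in Lambdas))

# k = 2 with rationally independent Lambda_1 = 1, Lambda_2 = sqrt(2) (assumed values).
Lambdas = (1.0, math.sqrt(2))
print(torus_distance(0.3, 1.7, Lambdas), sine_form(0.3, 1.7, Lambdas))

# k = 1: points congruent mod 1/Lambda_1 are at distance zero (up to rounding).
print(torus_distance(0.3, 0.3 + 1 / 2.0, (2.0,)))
```

The second print shows the identification mentioned in the footnote for k = 1: two points differing by 1/Lambda_1 are not separated by this distance.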

EXAMPLE 8. Let $G$ be a semi-simple Lie group.† The determination of all a.p. functions again means the determination of all irreducible unitary representations (which now of course need not be of degree 1). But such a representation is always continuous in the normal topology $T_0$ of $G$.‡ Therefore all a.p. functions in this $G$ are automatically $T_0$-continuous, in contrast with Examples 1, 2, and 4, 6 (cf. footnotes † and || on pages 478 and 479 respectively). Thus there is no need to discuss $S = [T_0]$ separately.

16. Examples 1-8 sufficiently illustrate the various possibilities of com- 
bining a.p. functions with topology to make further comment unnecessary. 
We shall now investigate another phenomenon. 


THEOREM 33. Let $G$ be a group and $S$ a cl.f. of functions in it. The following conditions on two elements $a$ and $b$ of $G$ are equivalent:


A. $D(a; \mathfrak{C}) = D(b; \mathfrak{C})$ (that is, $D_{\rho\sigma}(a; \mathfrak{C}) = D_{\rho\sigma}(b; \mathfrak{C})$) for every $\mathfrak{C}$ of $C$ for which the elements $D_{\rho\sigma}(\mathfrak{C})$ belong to $S$.

B. $D(a) = D(b)$ (that is, $D_{\rho\sigma}(a) = D_{\rho\sigma}(b)$) for every normal representation for which the elements $D_{\rho\sigma}$ belong to $S$.

C. $f(a) = f(b)$ for every a.p. function in $S$.


A is a special case of B, so that B implies A. 

As every D,,(x) is a.p. (Theorem 19 and Definition 10), B is a special case 
of C so that C implies B. 

Finally, Theorems 30 and 31 show that A implies C. 

Our three statements together prove the equivalence of A, B, and C. 


DEFINITION 15. We call two elements $a$ and $b$ of $G$ which satisfy one of the equivalent conditions of Theorem 33 $S$-coherent (if $S$ is the set of all functions, we abbreviate this to coherent). We denote the set of those elements which are $S$-coherent (coherent) with 1 by $G_0^S$ ($G_0$).


* That is, we are led to the Bohl-Esclangon [3] quasi-periodic functions with the basis $\Lambda_1, \dots, \Lambda_k$. Cf. also H. Bohr [4, II, pp. 111-117].

† For a detailed discussion of this notion cf. E. Cartan [7].

‡ This is a most remarkable difference between the behavior of Abelian Lie groups (cf. Example 4) and semi-simple Lie groups. It was discovered by B. L. van der Waerden [29, p. 785].




THEOREM 34. $G_0^S$ is an invariant subgroup of $G$, and if $S = [T]$ for a topology $T$ of $G$, then $G_0^S$ is $T$-closed. Those elements of $G$ which are coherent with a given $a$ form the coset of $G_0^S$ in $G$ belonging to $a$.


Consider the condition B in Theorem 33 (either A or C could also be used). If $a$ and $b$ belong to $G_0^S$ we have $D(ab) = D(a)D(b) = 1$, $D(a^{-1}) = D(a)^{-1} = 1$, that is, $a^{-1}$ and $ab$ belong to it; if only $a$ belongs to $G_0^S$, we have $D(b^{-1}ab) = D(b)^{-1}D(a)D(b) = D(b)^{-1}D(b) = 1$, that is, $b^{-1}ab$ also belongs to it. If $S = [T]$, every $D(a)$ is $T$-continuous, each set $D(a) = 1$ is $T$-closed, and so their common part $G_0^S$ is also. That $a$ and $b$ are coherent means that we always have $D(a) = D(b)$, $D(a^{-1}b) = D(a)^{-1}D(b) = 1$, that is, that $a^{-1}b$ belongs to $G_0^S$. Hence the elements $b$ form exactly the coset of $G_0^S$ in $G$ belonging to $a$.


DEFINITION 16. If $G_0^S = 1$ ($G_0 = 1$) we call $G$ and $S$ ($G$) maximally a.p.; if $G_0^S = G$ ($G_0 = G$) we call $G$ and $S$ ($G$) minimally a.p.


These two cases are indeed the two extremes which can occur. If $G$ and $S$ are minimally a.p., then for every a.p. $f(x)$ of $S$ we always have $f(a) = f(1)$, that is, the constants are the only a.p. functions in $S$. And for every normal representation $D(a)$ with the elements $D_{\rho\sigma}(a)$ in $S$, it must be $D(a) = D(1) = 1$, so that if $D(a)$ is irreducible its degree must be 1. If, on the other hand, $G$ and $S$ are maximally a.p., then there exists, for every pair $a$ and $b$ in $G$, $a \neq b$, a $\mathfrak{C}$ from $C$ such that all $D_{\rho\sigma}(\mathfrak{C})$ are in $S$ with $D(a; \mathfrak{C}) \neq D(b; \mathfrak{C})$, and an a.p. function $f(x)$ in $S$ such that $f(a) \neq f(b)$. Even more is true:


THEOREM 35. If $f(x)g(x)$ is in $S$ whenever $f(x)$ and $g(x)$ are in $S$, and if $G$ and $S$ are maximally a.p., then, for any finite set $a_1, \dots, a_n$ of distinct elements of $G$ and any set of complex numbers $\alpha_1, \dots, \alpha_n$, an a.p. function $f(x)$ exists in $S$ with the prescribed values $f(a_1) = \alpha_1, \dots, f(a_n) = \alpha_n$.


If $a \neq b$, there is an a.p. function $g(x)$ in $S$ such that $g(a) \neq g(b)$, so that

\[ h(x) = \frac{g(x) - g(b)}{g(a) - g(b)} \]

is an a.p. function in $S$ with $h(a) = 1$ and $h(b) = 0$. For every pair $a$ and $b$ ($a \neq b$), choose such a function $h(x)$ and denote it by $h(a, b; x)$. Then

\[ f(x) = \sum_{\nu=1}^{n} \alpha_\nu \prod_{\substack{\mu=1 \\ \mu \neq \nu}}^{n} h(a_\nu, a_\mu; x) \]

has all the properties required.
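
The interpolation argument just given is easy to try on a toy case. The sketch below assumes the finite cyclic group Z_12 and takes a single character as the separating function g for every pair (the proof allows a different g for each pair); `h` and `interpolate` are hypothetical helper names. It builds the product-and-sum function from the proof and evaluates it at the prescribed points.

```python
import cmath

N = 12  # assumed example group: integers mod 12 under addition

def g(x):
    """A function on Z_12 that takes distinct values at distinct elements."""
    return cmath.exp(2j * cmath.pi * (x % N) / N)

def h(a, b, x):
    """h(a, b; x) = (g(x) - g(b)) / (g(a) - g(b)); equals 1 at a and 0 at b."""
    return (g(x) - g(b)) / (g(a) - g(b))

def interpolate(points, values):
    """f(x) = sum over nu of alpha_nu * product over mu != nu of h(a_nu, a_mu; x)."""
    def f(x):
        total = 0
        for nu, a_nu in enumerate(points):
            term = values[nu]
            for mu, a_mu in enumerate(points):
                if mu != nu:
                    term *= h(a_nu, a_mu, x)
            total += term
        return total
    return f

points = [0, 3, 7, 10]                    # distinct group elements a_1..a_4
values = [1, 2j, -1.5, 0.25 + 0.75j]      # prescribed values alpha_1..alpha_4
f = interpolate(points, values)
print([f(a) for a in points])
```

At each a_nu every product except the nu-th contains a vanishing factor h(a_mu, a_nu; a_nu), which is why the sum reproduces alpha_nu there.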
There are also some other ways to characterize $G_0^S$, but we shall not discuss them here.




17. If G and S are maximally a.p., we can introduce a topology by means 
of their a.p. functions. In this connection the following notions are of im- 
portance: 


DEFINITION 17. If $G$ and $S$ are maximally a.p. we define a topology $FS$ in $G$ by considering the following "neighborhoods" $N(a)$ of an element $a$ of $G$:† Choose a finite number of a.p. functions $f_1, \dots, f_n$ and an $\epsilon > 0$; then $N(a) = N(a; f_1, \dots, f_n, \epsilon)$ is the set of all $b$'s such that $|f_1(a) - f_1(b)| < \epsilon, \dots, |f_n(a) - f_n(b)| < \epsilon$. (If $S$ is the set of all functions we abbreviate this to $F$.)


One sees at once that $FS$ satisfies Hausdorff's Axioms (cf. first footnote on page 447).


DEFINITION 18. If two topologies $T_1$ and $T_2$ for a set $S$ are given, $T_1$ is called weaker than $T_2$ if every $T_1$-neighborhood of an element $a$ of $S$ contains a $T_2$-neighborhood of $a$.


Obviously, every set which is closed or open in the $T_1$-sense, and every function which is continuous in the $T_1$-sense, has the same property in the $T_2$-sense. Thus for a group $S = G$, $[T_1]$ is a subset of $[T_2]$ (cf. Definition 14). On the other hand, it is obvious that if $S_1$ and $S_2$ are cl.f. and $S_1$ is a subset of $S_2$, then $FS_1$ is weaker than $FS_2$.

We intend to go more deeply into the theory of $[T]$ and $FS$ on another occasion. At present let us merely remark that for every $G$ (even for a non-topologically given $G$) $F$ is a topology determined by $G$ alone (if $G$ is maximally a.p.). Discussion of Examples 1 and 4 shows without much difficulty that $G_{\mathrm{rat}}$ and $G_{\mathrm{real}}$ are maximally a.p. (even with their cl.f. $[T_0]$ or $[T_p]$ and $[T_0]$ or $[T(\Lambda_1, \dots, \Lambda_k)]$ respectively (for $k > 1$, cf. footnote * on page 480)) and that their $F$'s are very "strong"; the condition $a_n \to a$ in $F$ as $n \to \infty$ means that all $a_n$'s, with a finite number of exceptions, are equal to $a$. On the other hand, Example 8 shows that, for a semi-simple Lie group $G$ (if it is maximally a.p.), $F = F[T_0]$. Theorem 36 shows that if $G$ is compact in $T_0$, $G$ and $[T_0]$ are maximally a.p. and $F[T_0] = T_0$. Thus, if $G$ is a semi-simple and compact Lie group, it is maximally a.p. and $F = T_0$.

18. The case where a group $G$ and a topology $T$ have the properties that $G$ and $[T]$ are maximally a.p. and $F[T] = T$ is of particular importance.


THEOREM 36. $G$ and $[T]$ are maximally a.p. and $F[T] = T$ in each of the two following cases (in case B, $F[T] = T$ should be understood to mean only that the condition $a_n \to a$ as $n \to \infty$ is equivalent in both senses):


A. $G$ is compact in $T$.

B. $G$ is locally compact† and separable† in $T$, $G$ is an Abelian group, and $ab$ and $a^{-1}$ are $T$-continuous in $a$ and $b$, and $a$ respectively.‡


† See first footnote on page 447.


If $G$ is compact in $T$, every continuous $f(x)$ is a.p.: for $G$ being compact, $f(x)$ is uniformly continuous; if any sequence $a_1, a_2, \dots$ is given, we can extract from it a subsequence $a_{n_1}, a_{n_2}, \dots$ which converges to a limit $a$, and then we have the result that $f(xa_{n_k}) \to f(xa)$ and $f(a_{n_k}x) \to f(ax)$ uniformly as $k \to \infty$. Thus $[T]$ consists only of a.p. functions.

Now it is possible to define a distance $D(a, b)$ in $G$ which is equivalent to the topology $T$ [27, 25]. $f(x) = D(a, x)$ belongs to $[T]$ and we have the result that $f(a) = 0 \neq f(b)$, proving that $G$ and $[T]$ are maximally a.p.; and the neighborhood $N(a; f, \epsilon)$ (cf. Definition 17) is the sphere with the center $a$ and the radius $\epsilon$, proving that $T$ is weaker than $F[T]$. But $F[T]$ is obviously weaker than $T$, and therefore $F[T] = T$. Thus A is proved.

The proof of B will be given at the end of Part V. 

Minimally a.p. groups likewise exist, for example, the group qn) of all 
linear transformations of determinant 1 in the real euclidean space of 
dimensions, »=2, 3,---. As it is a semi-simple Lie group, indeed even 
simple [6], all its bounded representations are continuous [29]; as it is a 
linear group, their continuity implies their differentiability [14, p. 37]. Hence 
we need only determine those irreducible representations of gn) which arise 


from “infinitesimal representations,” and see if there exist any bounded ones 
among them. Now these representations and their traces (characteristics) are 
known [70; 30, pp. 287, 300], and only the identity, D(a) =1, has a bounded 
trace, so no other representation can be bounded. Application of Theorem 33, 
criterion A, shows that g,,) is minimally a.p. 

The group g’ of all transformations y=ax+6 (a and 6 real, a>0) is 
neither minimally nor maximally a.p., as a simple discussion shows. 


V. ABELIAN GROUPS 


19. We assume throughout Part V that the assumptions of Theorem 36, case B (which we shall finally prove), hold; thus we assume that a group $G$ and a topology $T$ are given, that $G$ is locally compact and separable in $T$ and Abelian, and that $ab$, $a^{-1}$ are $T$-continuous.

Under the above topological assumptions, A. Haar has shown the existence of a right-invariant Lebesgue integral [11, pp. 166-167]. Thus it is possible to define for complex-valued functions $f(x)$ defined in $G$ (i) a notion of measurability, (ii) a notion of summability, (iii) an integral $\int_G f(x)\,dx$. On the basis of (i) and (ii), moreover, it is possible to do this in such a manner that (i)-(iii) have all the formal properties of these notions as in the usual Lebesgue theory, and besides are invariant under the substitution of $f(xa)$ for $f(x)$.

We now consider all measurable functions $f(x)$ in $G$ for which $|f(x)|^2$ is summable, that is, $\int_G |f(x)|^2\,dx$ is finite. These functions form a Hilbert space $\mathfrak{H}_G$ if we define the inner product $(f, g)$ to be $\int_G f(x)\overline{g(x)}\,dx$,‡ provided that $G$ is infinite, which we will assume to be the case. (If it is finite, it is compact and falls under case A.) In $\mathfrak{H}_G$,

† "Locally compact" means that each element $a$ has a conditionally compact neighborhood [13, p. 107]; as a group $G$ is homogeneous it is sufficient to postulate this for the element 1. "Separable" means that there exists a countably infinite "equivalent system of neighborhoods"; if the topology is originated by a distance notion, one may postulate the existence of a countably infinite everywhere dense subset [13, p. 125, and p. 229, Axiom 10].

‡ See first footnote on page 447.


\[ (\#) \qquad O_a f(x) = f(xa) \]

defines a linear and unitary operation (that is, an operation which leaves $(f, g)$ invariant), and it follows that

\[ (\#\#) \qquad O_a O_b = O_{ab}. \]

Now we use the Abelian character of $G$, by virtue of which (##) implies that $O_a$ and $O_b$ commute. As $O_a$ is unitary, its adjoint‡ is $O_a^* = O_a^{-1}$ and, by (##), $O_a^* = O_{a^{-1}}$. Thus every $O_a$ commutes with every $O_b$ and $O_b^*$, and the set of all operators $O_a$ has been called Abelian [16, p. 389]. Therefore a theorem proved by the author applies to this set: there exists a bounded Hermitian operator $R$ such that every $O_a$ is a function of $R$,

\[ (\dagger\dagger) \qquad O_a = \varphi_a(R), \]

where $\varphi_a(\lambda)$ is a complex-valued function of the variable $\lambda$.§ That the functions $\varphi_a(\lambda)$ can be used for the discussion of the group $G$ has been noted by Haar and successfully applied to countably infinite Abelian groups [10, p. 131]; cf. also Wiener and Paley [33]. Theorem 37 will be an application of this idea in the full generality allowed by Haar's right-invariant Lebesgue integral. It must be remarked, however, that Haar's method of discussing countably infinite Abelian groups has been considerably simplified by Wiener and Paley [33], but that their simplification seems not to apply to our general case, and that we have to use Haar's original method.
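
For a finite Abelian group the translation operators can be written out as matrices. The sketch below is only a finite-dimensional illustration, assuming the cyclic group Z_6 written additively (so $f(xa)$ becomes f(x + a)) and plain Python lists as matrices; all function names are hypothetical. It checks that O_a O_b = O_{a+b} and that each character is a simultaneous eigenvector of all the O_a, with eigenvalue equal to the character's value at a.

```python
import cmath

def translation(a, n):
    """Matrix of O_a on functions on Z_n: (O_a f)(x) = f(x + a)."""
    return [[1 if y == (x + a) % n else 0 for y in range(n)] for x in range(n)]

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(A, v):
    """Apply the matrix A to the vector v."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def character(k, n):
    """k-th character of Z_n as a vector of its values."""
    return [cmath.exp(2j * cmath.pi * k * x / n) for x in range(n)]

n = 6

# O_a O_b = O_{a+b} for all a, b (exact, the entries are integers).
composition_ok = all(
    matmul(translation(a, n), translation(b, n)) == translation((a + b) % n, n)
    for a in range(n) for b in range(n))

# O_a chi_k = chi_k(a) * chi_k: the characters diagonalize every O_a at once.
eigen_ok = all(
    abs(apply(translation(a, n), character(k, n))[x]
        - character(k, n)[a] * character(k, n)[x]) < 1e-9
    for a in range(n) for k in range(n) for x in range(n))

print(composition_ok, eigen_ok)
```

In this finite case the common eigenvalues chi_k(a) play the part that the functions $\varphi_a(\lambda)$ play in the general setting, with the index k in place of the spectral variable.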


THEOREM 37. If $G$ and $T$ fulfill the assumptions formulated at the beginning of this part (that is, if $G$ is locally compact, separable, and Abelian), there exists a function in two variables $\varphi(a, \lambda)$ ($a$ in $G$, $\lambda$ real) with the following properties: $\varphi(a, \lambda)$ is a Baire function in $(a, \lambda)$,* and there exists a "resolution of the identity" $E(\lambda)$ such that†

\[ (\ddagger) \qquad (O_a f, g) = \int_{-\infty}^{+\infty} \varphi(a, \lambda)\, d(E(\lambda)f, g) \]

identically in $a$ and $f(x)$ and $g(x)$.


‡ For the modern theory of Hilbert space cf. J. v. Neumann [15, pp. 63-70, 108-111]. Cf. further M. H. Stone [24, pp. 1-32].

§ The notion of a function of an operator is due originally to F. Riesz. More general forms have been given to it by J. v. Neumann [17, pp. 202-213] and M. H. Stone [24, pp. 221-241]. The theorem in question has been proved by J. v. Neumann [17, p. 214].

For the function $\varphi(a, \lambda) = \varphi_a(\lambda)$ in (††), the theorem mentioned in the footnote § on page 484 leads to all our statements except for the Baire character of $\varphi(a, \lambda)$ in $(a, \lambda)$ (it would show the Baire character in $\lambda$, but we need it also in $a$).

We know that finite linear aggregates of functions $f_{\mathfrak{O}}$, where $\mathfrak{O}$ is a conditionally compact open set, therefore having finite measure, and

\[ f_{\mathfrak{O}}(a) = \begin{cases} 1 & \text{for } a \text{ in } \mathfrak{O}, \\ 0 & \text{elsewhere,} \end{cases} \]

are everywhere dense in our functional space [15, p. 110]. From now on "everywhere dense" will be interpreted in the sense of the distance

\[ \|f - g\| = (f - g, f - g)^{1/2} = \Big[ \int_G |f(x) - g(x)|^2\,dx \Big]^{1/2}, \]

but not in the sense of the distance l.u.b.$_x\, |f(x) - g(x)|$. Now, if $\mathfrak{O}_1$ is an open set the closure of which is part of $\mathfrak{O}$, we can find a continuous function§

\[ f_{\mathfrak{O},\mathfrak{O}_1}(a) \; \begin{cases} = 1 & \text{for } a \text{ in } \mathfrak{O}_1, \\ = 0 & \text{for } a \text{ not in } \mathfrak{O}, \\ \geq 0 \text{ and } \leq 1 & \text{elsewhere.} \end{cases} \]

If we let $\mathfrak{O}_1$ converge to $\mathfrak{O}$, then $f_{\mathfrak{O},\mathfrak{O}_1}(a)$ converges everywhere to, and is majorized by, $f_{\mathfrak{O}}(a)$, so that $f_{\mathfrak{O}}(a)$ is its limit in the sense of the distance $\|f - g\|$. Therefore continuous functions $f$ which are $\neq 0$ only in conditionally compact sets are everywhere dense in our functional space. Since $a_n \to a$ implies $xa_n \to xa$, we have $f(xa_n) \to f(xa)$ for these functions, and, by the second property,

\[ \|O_{a_n} f - O_a f\| = \Big[ \int_G |f(xa_n) - f(xa)|^2\,dx \Big]^{1/2} \to 0. \]

Hence $a_n \to a$ implies $O_{a_n} f \to O_a f$ for an everywhere dense set of $f$'s, but as all $O_a$'s are unitary operators, and therefore uniformly continuous in $f$, the implication holds for every $f$. Consequently $O_a f$ is a continuous function in $a$ for every fixed $f$.


* That is, it can be obtained from continuous functions in $(a, \lambda)$ by successive limiting processes wherein the limit is always taken of everywhere convergent sequences.

† $\int_{-\infty}^{+\infty}$ is a Lebesgue-Stieltjes integral over $\lambda$. For an explanation of the terminology used, see [15, p. 92] or [24, p. 174].

§ This is a problem of Fréchet, first solved by Hahn. Cf. [26, Anhang III, p. 290].

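A simple computation shows, after substituting $E(\mu)g$ in (‡), at the end of Theorem 37, in place of $g$ [cf. 17, p. 206],

\[ (O_a f, E(\mu)g) = \int_{-\infty}^{\mu} \varphi(a, \lambda)\, d(E(\lambda)f, g). \]

Now choose a complete normalized orthogonal system $f_1, f_2, \dots$, put $f = g = f_n$, $n = 1, 2, \dots$, multiply by $2^{-n}$, and add. The infinite series thus obtained in the left- and right-hand members converge uniformly since $(O_a f, E(\mu)g)$ and $(E(\lambda)f, g)$ are both $\leq \|f\|\,\|g\|$ in absolute value. The result is

\[ \sum_{n=1}^{\infty} 2^{-n}(O_a f_n, E(\mu)f_n) = \int_{-\infty}^{\mu} \varphi(a, \lambda)\, d\Big( \sum_{n=1}^{\infty} 2^{-n}(E(\lambda)f_n, f_n) \Big). \]

$(O_a f_n, E(\mu)f_n)$ is continuous in $a$ and continuous on the right in $\mu$, $(E(\lambda)f_n, f_n)$ is continuous on the right in $\lambda$ and monotonically increasing, and the same properties hold for the uniformly convergent sums

\[ F(a, \mu) = \sum_{n=1}^{\infty} 2^{-n}(O_a f_n, E(\mu)f_n), \qquad G(\lambda) = \sum_{n=1}^{\infty} 2^{-n}(E(\lambda)f_n, f_n). \]

Thus $F(a, \mu)$ and $G(\lambda)$ are Baire functions, the latter is monotonically increasing, and

\[ F(a, \mu) = \int_{-\infty}^{\mu} \varphi(a, \lambda)\, dG(\lambda). \]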
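If we consider $G(\lambda)$ as the variable (instead of $\lambda$), then the well known theorem on the differentiability of integrals shows that

\[ \lim_{\delta, \epsilon \to +0} \frac{F(a, \mu + \delta) - F(a, \mu - \epsilon)}{G(\mu + \delta) - G(\mu - \epsilon)} \]

exists and equals $\varphi(a, \mu)$ except, however, for a set of $\mu$'s dependent on $a$ whose $\xi = G(\mu)$-image§ is a set (of real numbers) of Lebesgue measure zero. Now the function

\[ \varphi_1(a, \mu) = \begin{cases} \displaystyle \lim_{\delta, \epsilon \to +0} \frac{F(a, \mu + \delta) - F(a, \mu - \epsilon)}{G(\mu + \delta) - G(\mu - \epsilon)} & \text{when this limit exists,} \\ 0 & \text{otherwise} \end{cases} \]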


‡ Cf. [5, pp. 544-545]. Analogous results concerning "central derivatives" of $F$ with respect to $G$ are due to Daniell [8].

§ If $G(\mu)$ is discontinuous at $\mu = \mu_0$, the image of $\mu = \mu_0$ is supposed to be the whole jump-interval $G(\mu_0 - 0) \leq \xi \leq G(\mu_0 + 0)$.


—— 




is obviously a Baire function, and the set $\Sigma$ of $\lambda$'s for which $\varphi_1(a, \lambda) \neq \varphi(a, \lambda)$ has a $\xi = G(\mu)$-image of Lebesgue measure zero. Since $2^n G(\mu) - (E(\mu)f_n, f_n)$ is monotonically increasing, the $\xi = (E(\mu)f_n, f_n)$-image of $\Sigma$ is also of Lebesgue measure† zero, and therefore every $\xi = (E(\mu)f, f)$-image is of Lebesgue measure zero [17, p. 213, the last remark of Part II].

Hence we may replace $\varphi(a, \lambda)$ in (‡) by $\varphi_1(a, \lambda)$, and this will not affect the validity of (‡) for $f = g$; now if we replace our $f$ by $(f+g)/2$ and $(f-g)/2$ and subtract, we get the real part of the general (‡); if we replace $f$ and $g$ by $if$ and $g$, we get its imaginary part, and prove it altogether. Thus $\varphi_1(a, \lambda)$ meets all our requirements.


THEOREM 38. Under the assumptions of Theorem 37, $\varphi(a, \lambda)$ can even be chosen as a continuous function in $a$ satisfying the equations

\[ \varphi(ab, \lambda) = \varphi(a, \lambda)\,\varphi(b, \lambda), \qquad |\varphi(a, \lambda)| = 1. \]
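A simple computation shows [17, p. 206] that

\[ (O_a O_b f, g) = \int_{-\infty}^{+\infty} \varphi(a, \lambda)\,\varphi(b, \lambda)\, d(E(\lambda)f, g), \]

\[ (O_a^* O_a f, g) = \int_{-\infty}^{+\infty} |\varphi(a, \lambda)|^2\, d(E(\lambda)f, g); \]

on the other hand,

\[ (O_{ab} f, g) = \int_{-\infty}^{+\infty} \varphi(ab, \lambda)\, d(E(\lambda)f, g), \qquad (f, g) = \int_{-\infty}^{+\infty} d(E(\lambda)f, g). \]

Now $O_a O_b = O_{ab}$, $O_a^* O_a = 1$; therefore the right sides of our equations are equal. An analogous computation shows that if we substitute $E(\mu)g$ for $g$ [17, p. 206] and subtract, we get

\[ \int_{-\infty}^{\mu} \big(\varphi(ab, \lambda) - \varphi(a, \lambda)\,\varphi(b, \lambda)\big)\, d(E(\lambda)f, g) = 0, \]

\[ \int_{-\infty}^{\mu} \big(|\varphi(a, \lambda)|^2 - 1\big)\, d(E(\lambda)f, g) = 0. \]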


Putting $f = g$ shows that the equations of our Theorem hold except for a set of $\lambda$'s (depending on the pair $a$ and $b$ and on $a$ respectively), the $\xi = (E(\lambda)f, f)$-image of which has Lebesgue measure zero† (this condition holds for all $f$'s simultaneously). Returning to the complete normalized orthogonal system $f_1, f_2, \dots$ in the proof of Theorem 37, and to the corresponding

\[ G(\lambda) = \sum_{n=1}^{\infty} 2^{-n}(E(\lambda)f_n, f_n), \]

we see that also the $\xi = G(\lambda)$-images have Lebesgue measure zero (this follows for each $\sum_{n=1}^{N} 2^{-n}(E(\lambda)f_n, f_n)$ from the integral formula in the footnote on page 487, and for

\[ G(\lambda) = \sum_{n=1}^{\infty} 2^{-n}(E(\lambda)f_n, f_n) \]

from the fact that the difference is monotonically increasing, $\geq 0$, and $\leq \sum_{n=N+1}^{\infty} 2^{-n} = 1/2^N$).

† The Lebesgue measure of the $\xi = H(\mu)$-image of a set $\mathfrak{S}$ is $\int_{\mathfrak{S}} dH(\mu)$, and therefore, if it is 0 for $H(\mu)$, it will be 0 for every other function $K(\mu)$ for which $H(\mu) - K(\mu)$ is monotonically increasing [cf. 17, p. 198, rule d, and p. 199].

For $a$-sets and $b$-sets (that is, subsets of $G$) we have a measure, namely, the Haar-Lebesgue measure. For $\lambda$-sets (that is, sets of real numbers) we shall consider the Lebesgue measure of the $\xi = G(\lambda)$-image (cf. the footnote § on page 486), and call it the $\lambda$-measure. All these measures have the formal properties of the Lebesgue measure.* We can, by the analogue of the process which leads from linear to plane measure [cf. 18, p. 588], use these measures to define measures with similar properties for $(a, b)$-sets, $(a, \lambda)$-sets, and $(a, b, \lambda)$-sets. If we use these defining processes, the theorem of Fubini holds for all combinations of the variables $a$, $b$, $\lambda$ because its proof [5, pp. 622-628] applies unchanged.

As we are dealing with Baire functions, the $(a, b, \lambda)$- and $(a, \lambda)$-sets for which $\varphi(ab, \lambda) \neq \varphi(a, \lambda)\varphi(b, \lambda)$ and $|\varphi(a, \lambda)| \neq 1$ are Borel sets and therefore measurable. Hence Fubini's theorem can be applied to them; as for fixed $a$, $b$ and $a$ respectively they give $\lambda$-sets of zero $\lambda$-measure, they are sets of zero $(a, b, \lambda)$-measure and $(a, \lambda)$-measure themselves. This again implies that if $\lambda$ does not belong to a certain (fixed) set $\mathfrak{S}_1$ of zero $\lambda$-measure, and if $a$ does not belong to a certain set $\mathfrak{S}_2^{\lambda}$ (depending on $\lambda$) of zero (Haar) measure, then we have the result that, if $b$ does not belong to a certain set $\mathfrak{S}_3^{\lambda, a}$ (depending on $\lambda$ and $a$) of zero (Haar) measure, then $\varphi(ab, \lambda) = \varphi(a, \lambda)\varphi(b, \lambda)$, and, at any rate, $|\varphi(a, \lambda)| = 1$.

Now choose a conditionally compact open set ©. If we had, for a certain X, 


f o(x, A)dx = 0 
D 


for every, this would imply that 


f o(x, A)dx = 0 
M 


* By this we mean that they satisfy Carathéodory’s postulates I-V [5, pp. 238-239, 258]. 




for every measurable set I, and thus ¢(x, A) =O except for an x-set of )- 
measure zero. This contradicts the fact that we have |@(x, \)| =1 except for 
an x-set of \-measure zero. Therefore choose © such that 


f (x, A)dx ¥ 0, 
D 
and denote the set of all (ax)’s (x in O) by aO. 


Assume in ©; and a in Then we have =d(a, d) if 
¢ is not in S;*- and this implies that 


$(a, fio dx foes dx Js dy 


A)dx 
Jools, A)dx 


Now, by well known theorems on Lebesgue integrals, the numerator is con- 
tinuous in a, the denominator is constant and ~0, and for this argument we 
need not restrict a to S,™ (A, of course, is in S:). Hence we may define a 
continuous function ¢2(a, \) by putting it equal to 


S 


(A in S:); and we have ¢2(a, A) =¢(a, A) if a is not in S,™. 

In this case, if 6 is not in ©, and ab is not in SG, we have ¢2(ab, d) 
=¢2(a, \)do(b, A). But as we except only a b-set of zero (Haar) measure, this 
holds in an everywhere dense b-set, and thus, for reasons of continuity, for 
every b. So the above formula is true for every 5, and |¢2(a, \)| =1 is true if 
a is not in ©. For the same continuity reasons, therefore, there are no a- 
exceptions at all. Thus (if \ is not in G,) ¢a(a, A) meets the requirements of 
our theorem if it can replace (a, X). 

By definition, $\phi_2(a, \lambda)$ is a Baire function in $(a, \lambda)$. Hence the $(a, \lambda)$-set
for which $\phi(a, \lambda) \neq \phi_2(a, \lambda)$ is a Borel set and therefore measurable. Hence
Fubini's theorem applies to it, and since for a fixed $\lambda$, except in a set of zero
$\lambda$-measure ($\mathfrak{S}_1$), it gives an $a$-set of zero (Haar) measure ($\mathfrak{S}_2^{\lambda}$), it is a set of
zero $(a, \lambda)$-measure itself. This again implies that if $a$ does not belong to a
certain (fixed) set $\mathfrak{S}_1'$ of zero (Haar) measure, then $\phi(a, \lambda) = \phi_2(a, \lambda)$ provided
$\lambda$ does not belong to a certain set $\mathfrak{S}_2'^{\,a}$ (depending on $a$) of zero $\lambda$-measure.
If we change $\phi_2(a, \lambda)$ for the $\lambda$'s of $\mathfrak{S}_1$ into 1, we obtain its continuity in $a$
and the equations of our Theorem for all $\lambda$'s without exceptions, and the statement
just now proved still holds if we replace $\mathfrak{S}_2'^{\,a}$ by $\mathfrak{S}_2'^{\,a} + \mathfrak{S}_1$, which is
also a set of zero $\lambda$-measure.

If a does not belong to Si, we have (a, A) =¢2(a, A) except for a d-set 
with zero \-measure, that is, with a £=G(A)-image (cf. the footnote§ on page 
486) of zero Lebesgue measure. This proves, as at the end of the proof of 
Theorem 37, that 


$$(U_a f, g) = \int_{-\infty}^{+\infty} \phi_2(a, \lambda)\, d(E(\lambda)f, g)$$

identically in $f(x)$ and $g(x)$ if $a$ does not belong to $\mathfrak{S}_1'$. Since $\mathfrak{S}_1'$ has zero
(Haar) measure, the domain of validity in $a$ is everywhere dense. But both
sides are continuous functions of $a$: this was shown for the left side at the
beginning of the proof of Theorem 37, and follows for the right side from the
continuity of $\phi_2(a, \lambda)$ in $a$ for all $\lambda$'s. Hence our equation holds for all $a$'s.
Thus $\phi_2(a, \lambda)$ meets all our requirements.


THEOREM 39. If the assumptions of Theorems 37 and 38 are satisfied, and
if $ab$ and $a^{-1}$ are continuous in $(a, b)$ and in $a$ respectively, then the condition
$\phi(a, \lambda) = \phi(b, \lambda)$ for all $\lambda$'s is equivalent to the condition $a = b$, and the condition
$\phi(a_n, \lambda) \to \phi(a, \lambda)$ as $n \to \infty$ for all $\lambda$'s is equivalent to the condition $a_n \to a$ as
$n \to \infty$.


The first statement follows from the second by putting $a_1 = a_2 = \cdots = b$.
The necessity of the criterion in the second statement is obvious, as all the
functions $\phi(a, \lambda)$ are continuous in $a$. So the only thing we need to prove is
its sufficiency.

Therefore suppose that $\phi(a_n, \lambda) \to \phi(a, \lambda)$ as $n \to \infty$ for all $\lambda$'s. Then (f)
in Theorem 37 shows that $(U_{a_n}f, g) \to (U_af, g)$ as $n \to \infty$ for any $f(x)$ and $g(x)$
of our functional space. Now let $\mathfrak{O}$ be a conditionally compact open set, and
define

$$f_{\mathfrak{O}}(x) = \begin{cases} 1 & \text{for } x \text{ in } \mathfrak{O}, \\ 0 & \text{elsewhere.} \end{cases}$$

Put $f(x) = f_{\mathfrak{O}}(x)$ and $g(x) = f_{\mathfrak{O}}(ax)$. Then

$$(U_{a_n}f, g) = \int f(a_nx)f(ax)\,dx, \qquad (U_af, g) = \int |f(ax)|^2\,dx > 0,$$

and $(U_{a_n}f, g) \to (U_af, g)$ implies that if $n$ is sufficiently large, $(U_{a_n}f, g) \neq 0$ and
therefore $f(a_nx)f(ax) \neq 0$. Hence there is an $x$ for which $a_nx$ and $ax$ both belong
to $\mathfrak{O}$, and as $a_na^{-1} = (a_nx)(ax)^{-1}$, $a_na^{-1}$ can be written in the form $uv^{-1}$, where
$u$ and $v$ both belong to $\mathfrak{O}$. As every open set has conditionally compact open subsets,
this holds for every open $\mathfrak{O}$.

Now let $\mathfrak{N}$ be a neighborhood of $a$. Then we can find an open set $\mathfrak{O}$ for
which every $uv^{-1}a$, $u$ and $v$ in $\mathfrak{O}$, belongs to $\mathfrak{N}$. (Here is where the extra
continuity assumptions are used.) Then our result shows that if $n$ is sufficiently
large, $a_n$ belongs to $\mathfrak{N}$. This means that $a_n \to a$ as $n \to \infty$.

Theorems 37 and 38, combined with Theorem 19, show that each $\phi(a, \lambda)$,
when considered as an $a$-function, is a.p. and belongs to [7]. Therefore
Theorem 39 proves exactly the statements of Theorem 36, case B.


BIBLIOGRAPHY 


1. S. Banach, Sur l’équation fonctionnelle f(x+y) = f(x)+f(y), Fundamenta Mathematicae, vol.
1 (1920), pp. 123-124.
2. S. Bochner, Beiträge zur Theorie der fastperiodischen Funktionen, Mathematische Annalen,
vol. 96 (1927), pp. 119-147.
3. P. Bohl, Über die Darstellung von Funktionen einer Variabeln durch trigonometrische Reihen
mit mehreren einer Variabeln proportionalen Argumenten, Thesis, Dorpat, 1893.
4. H. Bohr, Zur Theorie der fastperiodischen Funktionen. I, Acta Mathematica, vol. 45 (1925),
pp. 29-127. II, Ibidem, vol. 46 (1925), pp. 101-214.
5. C. Carathéodory, Vorlesungen über reelle Funktionen. 2d edition. Leipzig and Berlin, 1927.
6. E. Cartan, Sur la structure des groupes de transformations finis et continus, Thesis, Paris, 1894.
7. E. Cartan, Les groupes projectifs qui ne laissent invariante aucune multiplicité plane, Bulletin
de la Société Mathématique de France, vol. 41 (1913), pp. 53-96.
8. P. J. Daniell, Differentiation with respect to a function of bounded variation, Transactions of
the American Mathematical Society, vol. 19 (1918), pp. 353-362.
9. M. Fréchet, Pri la funkcia ekvacio f(x+y) = f(x)+f(y), L’Enseignement Mathématique, vol.
15 (1913), pp. 390-393.
10. A. Haar, Über unendliche kommutative Gruppen, Mathematische Zeitschrift, vol. 33 (1931),
pp. 129-159.
11. A. Haar, Der Massbegriff in der Theorie der kontinuierlichen Gruppen, Annals of Mathematics,
(2), vol. 34 (1933), pp. 147-169.
12. G. Hamel, Eine Basis aller Zahlen und die unstetigen Lösungen der Funktionalgleichung
f(x+y) = f(x)+f(y), Mathematische Annalen, vol. 60 (1905), pp. 459-462.
13. F. Hausdorff, Mengenlehre. 2d edition. Berlin and Leipzig, 1927.
14. J. v. Neumann, Über die analytischen Eigenschaften von Gruppen linearer Transformationen
und ihrer Darstellungen, Mathematische Zeitschrift, vol. 30 (1929), pp. 3-42.
15. J. v. Neumann, Allgemeine Eigenwerttheorie Hermitescher Funktionaloperatoren,
Mathematische Annalen, vol. 102 (1930), pp. 49-131.
16. J. v. Neumann, Zur Algebra der Funktionaloperationen und Theorie der normalen Operatoren,
Mathematische Annalen, vol. 102 (1930), pp. 370-427.
17. J. v. Neumann, Über Funktionen von Funktionaloperatoren, Annals of Mathematics, (2), vol.
32 (1931), pp. 191-226.
18. J. v. Neumann, Zur Operatorenmethode in der klassischen Mechanik, Annals of Mathematics,
(2), vol. 33 (1932), pp. 587-642.
19. J. v. Neumann, Zum Haarschen Mass in topologischen Gruppen, Compositio Mathematica,
vol. 1 (1934), pp. 106-114.




492 J. v. NEUMANN 


20. E. Schmidt, Zur Theorie der linearen und nichtlinearen Integralgleichungen, Mathematische
Annalen, vol. 63 (1907), pp. 433-476.
21. I. Schur, Neue Begründung der Theorie der Gruppencharaktere, Sitzungsberichte der
Preussischen Akademie, Phys. Math. Kl., 1905, pp. 406-432.
22. I. Schur, Neue Anwendungen der Integralrechnung auf Probleme der Invariantentheorie,
Sitzungsberichte der Preussischen Akademie, Phys. Math. Kl., 1924, pp. 183-208.
23. W. Sierpinski, Sur l’équation fonctionnelle f(x+y) = f(x)+f(y), Fundamenta Mathematicae,
vol. 1 (1920), pp. 125-129.
24. M. H. Stone, Linear Transformations in Hilbert Space, American Mathematical Society
Colloquium Publications, vol. XV, 1932.
25. A. Tichonoff, Über einen Metrisationssatz von P. Urysohn, Mathematische Annalen, vol. 95
(1926), pp. 139-142.
26. P. Urysohn, Über die Mächtigkeit der zusammenhängenden Mengen, Mathematische Annalen,
vol. 94 (1925), pp. 262-308.
27. P. Urysohn, Zum Metrisationsproblem, Mathematische Annalen, vol. 94 (1925), pp. 309-315.
28. H. D. Ursell, Normality of almost periodic functions, First Note, Journal of the London
Mathematical Society, vol. 4 (1929), pp. 123-127. Second Note, Ibidem, vol. 5 (1930), pp. 47-50.
29. B. L. van der Waerden, Stetigkeitssätze der halbeinfachen Lieschen Gruppen, Mathematische
Zeitschrift, vol. 36 (1933), pp. 780-786.
30. H. Weyl, Theorie der Darstellung kontinuierlicher halbeinfacher Gruppen durch lineare
Transformationen. I, Mathematische Zeitschrift, vol. 23 (1925), pp. 271-309.
31. H. Weyl, Integralgleichungen und fastperiodische Funktionen, Mathematische Annalen, vol.
97 (1927), pp. 338-356.
32. H. Weyl and F. Peter, Die Vollständigkeit der primitiven Darstellungen einer geschlossenen
kontinuierlichen Gruppe, Mathematische Annalen, vol. 97 (1927), pp. 737-755.
33. N. Wiener and R. E. A. C. Paley, Characters of Abelian groups, Proceedings of the National
Academy of Sciences, vol. 19 (1933), pp. 253-257.


INSTITUTE FOR ADVANCED STUDY, 
PRINCETON, N. J. 




WARING’S PROBLEM FOR CUBIC FUNCTIONS* 


BY 
G. CUTHBERT WEBBER 


1. Introduction. L. E. Dickson† has proved that all integers sufficiently
large are sums of nine values of $f(x) = x + \epsilon(x^3 - x)/6$, where $\epsilon$ is prime to 3.
In §6 of this paper the author considers the above function $f(x)$ with $\epsilon = 3a$.
For $a \equiv 0$ or 1 (mod 3) he obtains the same result as Dickson obtained for $\epsilon$
prime to 3. However, for $a \equiv 2$ (mod 3) it is proved that every integer sufficiently
large is expressible as a sum of ten values of $f(x)$.

Certain classes of cubic functions with the square term present are treated 
in §§2-5, inclusive, the results being stated in Theorems 1, 2, and 3. These 
results are analogous to those stated for polynomials without square term. 

In the same paper Dickson showed that all positive integers are sums of
nine values of $f(x) = (x^3 + 2x)/3$ and stated the possibility of such a theorem
for $f(x) = (x^3 + 5x)/6$. Miss Frances Baker‡ proved a universal theorem for
representation of weight nine by $f(x) = (x^3 + x)/2$. The only cubic functions of
the form $f(x) = x + \epsilon(x^3 - x)/6$ for which it is possible to obtain a universal
theorem giving representation of weight nine are those for which $\epsilon$ takes one
of the values 1, ..., 6. The author proves in §7 that every integer may be
represented as a sum of fifteen values of $f(x) = x^3 + 3(x^2 - x)$ for values $\geq 0$ of $x$.
Since 41 requires fifteen values this is the best theorem obtainable.
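
The remark about 41 can be checked numerically. The following is a minimal sketch, assuming that the function of §7 reads $f(x) = x^3 + 3(x^2 - x)$ (the scan is damaged at this point); the helper names `cubic` and `min_terms` are illustrative and not from the paper. It computes, by dynamic programming, the least number of positive values of $f$ needed to represent each integer up to a limit.

```python
def cubic(x: int) -> int:
    """Assumed reading of the function in section 7: f(x) = x^3 + 3(x^2 - x)."""
    return x**3 + 3 * (x * x - x)


def min_terms(limit: int) -> list[int]:
    """best[n] = least number of positive values of f that sum to n."""
    values = []
    x = 1
    while cubic(x) <= limit:
        values.append(cubic(x))  # 1, 14, 45, ...
        x += 1
    best = [0] + [limit + 1] * limit
    for n in range(1, limit + 1):
        # f(1) = 1 is always available, so the minimum is well defined.
        best[n] = 1 + min(best[n - v] for v in values if v <= n)
    return best


best = min_terms(200)
print(best[41])  # 41 = 14 + 14 + thirteen 1's, so this prints 15
```

Since values equal to zero may be added freely, "a sum of fifteen values" corresponds to `best[n] <= 15`. The sketch only inspects a finite range and is no substitute for the proof in §7.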

2. Determination of all functions (1) having certain properties. We consider
cubic functions of the form

$$f(x) = \frac{ax^3 + b_0x^2 + cx}{d}, \qquad a > 0,\ b_0 \neq 0, \tag{1}$$

where $a$, $b_0$, $c$ and $d$ are integers having no common divisor greater than 1.
Further, in order that a true Waring's Problem be considered, it is stipulated
that the coefficients of $f(x)$ must satisfy the following conditions:

(a) that the values of $f(x)$ be positive integers for all integral values $\geq 0$
of $x$,

(b) that the function have the value 1 for some integral value $\xi \geq 0$ of $x$.

The quantities f(1), f(2) and f(3) will be integral if d divides each of 

* Presented to the Society, April 6, 1934; received by the editors February 5, 1934.

† Waring's problem for cubic functions, these Transactions, vol. 36 (1934), pp. 1-12.

‡ A Contribution to the Waring Problem for Cubic Functions, Doctoral Dissertation, University
of Chicago, 1934.



$a + b_0 + c$, $8a + 4b_0 + 2c$, and $27a + 9b_0 + 3c$. Eliminate $c$ and $b_0$ from these three
expressions; the result of this process is that $d$ divides $6a$. Consequently $d$
divides each of $2b_0$ and $6c$. If $d$ has a prime factor $p > 3$, $p$ divides each of $a$, $b_0$,
and $c$ contrary to hypothesis. Thus the only prime factors of $d$ are 2 and 3.
Since $f(\xi) = 1$, $d$ must satisfy

$$d = a\xi^3 + b_0\xi^2 + c\xi. \tag{2}$$
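
The elimination step can be made explicit. The sketch below (the function name `eliminate` is illustrative) writes $6a$, $2b_0$ and $6c$ as integer combinations of the three quantities $a + b_0 + c$, $8a + 4b_0 + 2c$ and $27a + 9b_0 + 3c$; any common divisor $d$ of the three quantities therefore divides $6a$, $2b_0$ and $6c$.

```python
def eliminate(a: int, b0: int, c: int) -> tuple[int, int, int]:
    """Return (6a, 2*b0, 6c) built only from d*f(1), d*f(2), d*f(3)."""
    e1 = a + b0 + c            # d * f(1)
    e2 = 8 * a + 4 * b0 + 2 * c    # d * f(2)
    e3 = 27 * a + 9 * b0 + 3 * c   # d * f(3)
    six_a = e3 - 3 * e2 + 3 * e1        # third difference removes b0 and c
    two_b0 = e2 - 2 * e1 - six_a        # e2 - 2*e1 equals 6a + 2*b0
    six_c = 6 * e1 - six_a - 3 * two_b0
    return six_a, two_b0, six_c


print(eliminate(5, -7, 11))  # (30, -14, 66)
```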


Case I, $d = 6w$. Since $w$ divides each of $a$, $b_0$ and $c$, then $w = 1$. From (2),
$\xi$ is a positive divisor of 6. Since $d = 6$ divides $2b_0$, let $b_0 = 3b$.

I$_1$, $\xi = 1$. Thus, from (2), $c = 6 - a - 3b$, and

$$f(x) = \frac{a}{6}(x^3 - x) + \frac{b}{2}(x^2 - x) + x.$$

If $b < 0$, write $b = -b_1$. Then necessary and sufficient conditions that $f(x)$
satisfy property (a) above are $b > 0$, or if $b < 0$ then $0 < b_1 < a + 2$ if $a \geq 3$, and
$0 < b_1 \leq (4a + 4)/3$ if $a = 1$ or 2.

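I$_2$, $\xi = 2$. Thus $c = 3 - 4a - 6b$ and

$$f(x) = \frac{a}{6}(x^3 - x) + \frac{b}{2}(x^2 - x) + \frac{1}{2}(1 - a - b)x.$$

In order that $f(x)$ be integral $a$ and $b$ must have different parity. The condition
$f(1) \geq 0$ requires $b \leq 1 - a$. Let $b = -b_1$; thus $b_1 \geq a - 1$. Also, $f(3) \geq 0$,
$f(4) \geq 0$ and $f(5) \geq 0$ require $b_1 \leq \eta$, where $\eta = (4a + 1)/2$ if $a = 1$ and
$\eta = (5a + 3)/3$ if $a > 1$. Accordingly, the conditions $a - 1 \leq b_1 \leq \eta$ and $a - b_1 \equiv 1$
(mod 2) are necessary and sufficient that

$$f(x) = \frac{a}{6}(x^3 - x) - \frac{b_1}{2}(x^2 - x) + \frac{1}{2}(1 - a + b_1)x$$

satisfy conditions (a) and (b).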
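
Conditions of this kind are easy to test numerically for a particular choice of coefficients. The following sketch is an assumption-laden check, not part of the paper: it takes the Case I sub-case $\xi = 2$ in the form $d = 6$, $b_0 = 3b$, $c = 3 - 4a - 6b$ with $b = -b_1$, reads condition (a) as "integral and $\geq 0$" (since $f(0) = 0$), and inspects only finitely many $x$. The names `satisfies_conditions` and `case_I2` are hypothetical.

```python
from fractions import Fraction


def f_value(a: int, b0: int, c: int, d: int, x: int) -> Fraction:
    """Value of f(x) = (a x^3 + b0 x^2 + c x) / d as an exact fraction."""
    return Fraction(a * x**3 + b0 * x**2 + c * x, d)


def satisfies_conditions(a: int, b0: int, c: int, d: int, x_max: int = 60) -> bool:
    """Finite check of (a) integral, non-negative values and (b) the value 1 occurs."""
    values = [f_value(a, b0, c, d, x) for x in range(x_max + 1)]
    integral_nonneg = all(v.denominator == 1 and v >= 0 for v in values)
    takes_one = any(v == 1 for v in values)
    return integral_nonneg and takes_one


def case_I2(a: int, b1: int) -> tuple[int, int, int, int]:
    """Coefficients (a, b0, c, d) for d = 6, xi = 2, b = -b1."""
    b = -b1
    return a, 3 * b, 3 - 4 * a - 6 * b, 6


print(satisfies_conditions(*case_I2(1, 2)))  # b1 = 2 lies within the stated bounds
print(satisfies_conditions(*case_I2(1, 4)))  # b1 = 4 exceeds the bound, f(3) < 0
```

For $a = 1$, $b_1 = 2$ the function is $(x^3 - 6x^2 + 11x)/6 = (x-1)(x-2)(x-3)/6 + 1$, which makes the positivity visible directly.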

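I$_3$, $\xi = 3$. With $c = 2 - 9a - 9b$ the conditions $f(1) \geq 0$, $f(2) \geq 0$, and $f(4) \geq 0$
require that $b = -b_1$, $b_1 > 0$ and $(5a - 2)/3 \leq b_1 \leq (7a + 2)/3$. The function $f(x)$
becomes

$$f(x) = \frac{a}{6}(x^3 - x) - \frac{b_1}{2}(x^2 - x) + \frac{1}{3}(1 - 4a + 3b_1)x.$$

Further, $a \equiv 1$ (mod 3).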
I$_6$, $\xi = 6$. Conditions (a) and (b) require

$$f(x) = \frac{a}{6}(x^3 - x) - \frac{b_1}{2}(x^2 - x) + \frac{1}{6}(1 - 35a + 15b_1)x,$$

where $a$ and $b_1$ are such that $a \equiv 3b_1 - 1$ (mod 6) and $(20a - 1)/6 < b_1 < (13a + 1)/3$.
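Case II, $d = 3w$, $w$ odd. As in I, $w = 1$ and $b_0 = 3b$. From (2), $\xi = 1$ or 3.

II$_1$, $\xi = 1$. With $c = 3 - a - 3b$,

$$f(x) = \frac{a}{3}(x^3 - x) + b(x^2 - x) + x.$$

Condition (a) requires that $b \geq (-3 - 8a)/6$.

II$_3$, $\xi = 3$. Since $c = 1 - 9a - 9b$, writing $b = -b_1$,

$$f(x) = \frac{a}{3}(x^3 - x) - b_1(x^2 - x) + \frac{1}{3}(1 - 8a + 6b_1)x.$$

Condition (a) requires that $a \equiv 2$ (mod 3) and that $(5a - 1)/3 < b_1 < (7a + 1)/3$.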

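Case III, $d = 2w$, $(w, 3) = 1$. According to hypothesis $w = 1$; from (2) $\xi = 1$
or 2.

III$_1$, $\xi = 1$. Then

$$f(x) = \frac{a}{2}(x^3 - x) + \frac{b_0}{2}(x^2 - x) + x,$$

where $b_0 \geq -2 - 3a$.

III$_2$, $\xi = 2$. Then

$$f(x) = \frac{a}{2}(x^3 - x) + \frac{b_0}{2}(x^2 - x) + \frac{1}{2}(1 - 3a - b_0)x,$$

where $a + b_0 \equiv 1$ (mod 2), $b_0 < 0$ and $3a - 1 \leq -b_0 \leq 5a + 1$.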
Case IV, $d = 1$. From (2), $\xi = 1$. This requires

$$f(x) = a(x^3 - x) + b_0(x^2 - x) + x,$$

where $b_0 \geq -1 - 3a$.
Consider

$$f(x) = \frac{p}{6}(x^3 - x) + \frac{q}{2}(x^2 - x) + ux, \qquad p > 0,\ q \neq 0, \tag{3}$$

where $p$, $q$ and $u$ are integers satisfying necessary and sufficient conditions,
as stated in Cases I-IV, that $f(x)$ be integral and $\geq 0$ for all integral values
$\geq 0$ of $x$. The substitution $x = X + t$, with $q = -pt$, transforms (3) into

$$f(x) = F(X) + a, \tag{4}$$

where

$$F(X) = \frac{p}{6}(X^3 - X) + gX,$$

$$a = ut + \frac{1}{6}(3pt^2 - 2pt^3 - pt) \quad \text{(an integer)},$$

$$g = u - \frac{p}{2}(t^2 - t).$$


In §§3, 4, and 5 the only values of $p$ and $q$ considered are those for which
the above transformation is possible, i.e., values such that $q = -tp$, $t$ an integer.
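
The substitution can be verified with exact rational arithmetic. The sketch below assumes the readings $f(x) = \frac{p}{6}(x^3 - x) + \frac{q}{2}(x^2 - x) + ux$, $F(X) = \frac{p}{6}(X^3 - X) + gX$, $g = u - \frac{p}{2}(t^2 - t)$ and $a = ut + \frac{1}{6}(3pt^2 - 2pt^3 - pt)$, which were reconstructed from a damaged scan; the function names are illustrative only.

```python
from fractions import Fraction


def f(p: int, q: int, u: int, x: int) -> Fraction:
    """Function (3) under the assumed reading."""
    return Fraction(p, 6) * (x**3 - x) + Fraction(q, 2) * (x**2 - x) + u * x


def shifted_constants(p: int, u: int, t: int) -> tuple[Fraction, Fraction]:
    """Return (g, a) for the substitution x = X + t with q = -p*t."""
    g = u - Fraction(p, 2) * (t * t - t)
    a = u * t + Fraction(3 * p * t**2 - 2 * p * t**3 - p * t, 6)
    return g, a


def F(p: int, g: Fraction, X: int) -> Fraction:
    """Function F(X) of (4): the square term has been removed."""
    return Fraction(p, 6) * (X**3 - X) + g * X


# Spot check: f(X + t) - a should agree with F(X).
p, u, t = 4, 1, -2
g, a = shifted_constants(p, u, t)
print(all(f(p, -p * t, u, X + t) == F(p, g, X) + a for X in range(-5, 6)))
```

Both $g$ and $a$ come out integral here because $t^2 - t$ is even and $2t^3 - 3t^2 + t = t(t-1)(2t-1)$ is divisible by 6.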

3. The functions (5) for $(p, 3) = 1$. The functions to be studied in this section
are

$$F(X) = \frac{p}{6}(X^3 - X) + gX \qquad (p > 0,\ (p, 3) = 1,\ p \text{ and } g \text{ integers}). \tag{5}$$

The investigation is entirely analogous to that of L. E. Dickson in his Transactions
paper mentioned heretofore except that the inequalities require
$X \geq |t|$. This ensures that $x \geq 0$.

We prove 

THEOREM 1. Let the triple of integers $p$, $q$, $u$ be given satisfying the conditions
stated at the end of §2, $(p, 3) = 1$, and let $a$ be defined as under (4). Then there
exist integers $C$ and $\nu$ such that every integer $\geq C\cdot 3^{3\nu} + 9a$ is a sum of nine
values of (3) for integral values $\geq 0$ of $x$.

Let $|t| < 3^{\delta}$, $\delta$ being an integer $\geq 0$.

The following three lemmas are necessary.

Lemma 1. Let the integers $t$ and $\delta$ be defined as above. Corresponding to any
positive integer $s$ there exists an integer $m'$ such that $s$ is congruent to $F(3m')$
modulo $3^n$, where $|t| < 3^{\delta} \leq 3m' < 3^{n+1} + 3^{\delta}$.

Define $\Delta$ by

$$\Delta = F(z + 3r) - F(z) = \frac{p}{2}(3rz^2 + 9r^2z + 9r^3 - r) + 3gr.$$

It may be proved by induction that $\Delta \not\equiv 0$ (mod $3^n$) if and only if $r \not\equiv 0$ (mod
$3^n$). Let $m''$ be an arbitrary integer such that $0 \leq m'' < 3^n$ and let $k'$ be an
integer such that $0 \leq k' < 3^n$, $k' < m''$. Then, if $m'$ and $k$ are defined by
$m' = m'' + 3^{\delta-1}$ and $k = k' + 3^{\delta-1}$, we obtain $m' - k = m'' - k' \not\equiv 0$ (mod $3^n$). Use
this value $m' - k$ as an $r$ in $\Delta$. Then

$$F(3m') - F(3k) = F\{3k + 3(m' - k)\} - F(3k) \not\equiv 0 \pmod{3^n}.$$

Since $m''$ ranges over a complete residue system modulo $3^n$, $m'$ does likewise,
hence the same is true of $F(3m')$. From $3m' = 3m'' + 3^{\delta}$ it follows that
$3^{\delta} \leq 3m' < 3^{n+1} + 3^{\delta}$. This proves the lemma.
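
The content of Lemma 1 is that $m \mapsto F(3m)$ permutes the residues modulo $3^n$ when $p$ is prime to 3. A small numerical check is sketched below, assuming $F(X) = \frac{p}{6}(X^3 - X) + gX$ with integers $p$ and $g$; the helper name `residues_of_F3m` is illustrative.

```python
def F(p: int, g: int, X: int) -> int:
    # X^3 - X is a product of three consecutive integers, hence divisible by 6,
    # so the integer division below is exact.
    return p * (X**3 - X) // 6 + g * X


def residues_of_F3m(p: int, g: int, n: int) -> set[int]:
    """Residues of F(3m) modulo 3^n as m runs over a complete residue system."""
    mod = 3**n
    return {F(p, g, 3 * m) % mod for m in range(mod)}


# For p prime to 3 every residue modulo 3^n should occur exactly once.
print(len(residues_of_F3m(5, -2, 3)) == 3**3)
```

The reason is visible from $2\Delta = r\,[p(3z^2 + 9rz + 9r^2 - 1) + 6g]$: the bracket is congruent to $-p$ modulo 3, so $\Delta$ and $r$ are divisible by the same power of 3.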

Lemma 2 is taken directly from the Dickson Transactions paper. 


Lemma 2. If $n$ is an odd constant integer, $v(n - v)$ is even and can be made
congruent to any assigned even integer modulo $2^r$ by choice of an integer $v$.
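
Lemma 2 can be confirmed by enumeration for small moduli. In the sketch below the exponent is called `r` (an assumption, since the symbol is illegible in the scan) and the function name is illustrative.

```python
def products_mod_power_of_two(n: int, r: int) -> set[int]:
    """All residues of v*(n - v) modulo 2^r as v runs over 0 .. 2^r - 1."""
    mod = 2**r
    return {(v * (n - v)) % mod for v in range(mod)}


# For odd n the residues reached are exactly the even ones.
print(products_mod_power_of_two(7, 3) == {0, 2, 4, 6})
```

The map $v \mapsto v(n - v)$ is exactly two-to-one modulo $2^r$, because $v(n-v) \equiv v'(n-v')$ forces $2^r$ to divide the even one of the factors $v - v'$ and $n - v - v'$.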


Lemma 3. If $n \geq \max(3, \delta)$, $g \leq 13p + 1$, and $3m < 3^{n+1}p + 3^{\delta}$, then $F(3m)
< 3^{3n}\gamma$, where


(6) (0p + 9p? + 4p +1). 


Since X*—X is monotone increasing, 


F(3m) = — 3m) + — m) + 3(13p + 1)m 


< Q. + Q. + Q. 336-3 3"p 3*) 
+ (13p + 1)(3"*"p + 3°) 
< (9p? + 9p? + 4p + 1) = 
According to Lemma 1 every integer $s$ may be written as $s = F(3m')
+ 3^nM'$, where $M'$ is an integer. Substituting $z = 3m'$ and $r = 3^ny$ in $\Delta$, and
writing $\Delta = 3^nE$, we obtain

$$E = \frac{p}{2}(3yz^2 + 9\cdot 3^ny^2z + 9\cdot 3^{2n}y^3 - y) + 3gy; \tag{7}$$

also, with $m = m' + 3^ny$,

$$F(3m) - F(3m') = F(z + 3r) - F(z) = \Delta = 3^nE.$$

Thus

$$s = F(3m) + 3^nM, \tag{8}$$

where $M = M' - E$ is an integer. Later we will choose $y$ such that $0 \leq y < p$.
With these values of $y$ and $m'$ the upper bound used in Lemma 3 is obtained,
$|t| < 3m < 3^{n+1}p + 3^{\delta}$.

Consideration of the values of $p$, $q$, and $u$ of the different sub-cases in
I and II of §2 shows that $13p + 1$ is the upper bound* of $g$; this was used in
Lemma 3. For example, in Case I$_6$,

$$g = \frac{1}{6}(1 - 35p - 18q) - \frac{q^2}{2p} \leq 13p + 1$$

since $1 - 35p < 0$ and $-3q \leq 13p + 1$.

* If universal theorems are desired it is advantageous to lower this upper bound, this being
possible when a particular function is being considered. The upper bound of $F(3m_i)$ may usually be
lowered by consideration apart from the general theory.
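The integer $s$ lies in some interval

$$C\cdot 3^{3n} \leq s \leq C\cdot 3^{3n+3}$$

and thus in one of the sub-intervals

$$3^{i-1}C\cdot 3^{3n} \leq s_i \leq 3^iC\cdot 3^{3n} \qquad (i = 1, 2, 3).$$

Since $f(x) \geq 0$, $F(X) \geq -a$, and thus $-a \leq F(3m_i) < 3^{3n}\gamma$. From $3^{i-1}C\cdot 3^{3n}
\leq F(3m_i) + 3^nM_i \leq 3^iC\cdot 3^{3n}$ and the last inequality we obtain

$$(3^{i-1}C - \gamma)3^{2n} < M_i \leq 3^iC\cdot 3^{2n} + \frac{a}{3^n}. \tag{9}$$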

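Six functional values to be used in the representation of $s$ have the sum

$$T_i = \sum_{j=1}^{3}\{F(3^n - X_j) + F(3^n + X_j)\} = p(3^{3n} + 3^nQ_i - 3^n) + 6g\cdot 3^n, \tag{10}$$

where $Q_i = \sum_{j=1}^{3} X_j^2$. The two remaining values to be used are given by

$$\sigma_i = F(v_i) + F(w_i) = (v_i + w_i)\left[\frac{p}{6}(v_i^2 - v_iw_i + w_i^2 - 1) + g\right]. \tag{11}$$

Let $v_i + w_i = 3b_i\cdot 3^n$, where $b_i$ is an odd positive integer. Then $\sigma_i = 3^nB_i$, where

$$B_i = b_i\left[\frac{p}{2}\{9b_i^2\cdot 3^{2n} - 3v_i(3b_i\cdot 3^n - v_i) - 1\} + 3g\right]. \tag{12}$$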
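
The closed forms in (10) and (12) follow from expanding the cubic, and can be spot-checked with exact arithmetic. The sketch below assumes $F(X) = \frac{p}{6}(X^3 - X) + gX$ and the readings $T_i = p(3^{3n} + 3^nQ_i - 3^n) + 6g\cdot 3^n$ and $B_i = b_i[\frac{p}{2}\{9b_i^2 3^{2n} - 3v_i(3b_i 3^n - v_i) - 1\} + 3g]$, both reconstructed from a damaged scan; all function names are illustrative.

```python
from fractions import Fraction


def F(p: int, g: int, X: int) -> Fraction:
    return Fraction(p, 6) * (X**3 - X) + g * X


def T_direct(p: int, g: int, n: int, Xs: tuple[int, int, int]) -> Fraction:
    """Sum of the six values F(3^n - X_j) + F(3^n + X_j)."""
    A = 3**n
    return sum(F(p, g, A - X) + F(p, g, A + X) for X in Xs)


def T_closed(p: int, g: int, n: int, Xs: tuple[int, int, int]) -> int:
    """Closed form (10) with Q = X_1^2 + X_2^2 + X_3^2."""
    A = 3**n
    Q = sum(X * X for X in Xs)
    return p * (A**3 + A * Q - A) + 6 * g * A


def B(p: int, g: int, n: int, b: int, v: int) -> Fraction:
    """Quantity (12), so that F(v) + F(w) = 3^n * B when v + w = 3*b*3^n."""
    A = 3**n
    return b * (Fraction(p, 2) * (9 * b * b * A * A - 3 * v * (3 * b * A - v) - 1) + 3 * g)


p, g, n = 2, -1, 2
print(T_direct(p, g, n, (1, 4, 7)) == T_closed(p, g, n, (1, 4, 7)))
v, b = 17, 5
w = 3 * b * 3**n - v
print(F(p, g, v) + F(p, g, w) == 3**n * B(p, g, n, b, v))
```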


A necessary and sufficient condition that there exist values of $X$ of the
forms $3m_i$, $3^n - X_j$, $3^n + X_j$, $v_i$ and $w_i$ ($j = 1, 2, 3$) for which $s_i$ is expressible
as the sum of nine values of (5) is that

$$s_i = F(3m_i) + 3^nM_i = F(3m_i) + \sigma_i + T_i,$$

or

$$3^nM_i = \sigma_i + T_i = 3^nB_i + p(3^{3n} + 3^nQ_i - 3^n) + 6g\cdot 3^n,$$

$$pQ_i = M_i - B_i - p(3^{2n} - 1) - 6g. \tag{13}$$

The value of $Q_i$ as defined in (13) will be shown later to be integral.

We proceed to introduce inequalities which will enable us to choose the
desired constants $b_i$ and which will ensure that the arguments of $F(X)$ are
$\geq |t|$.


Choose 2; and Q; such that 
(14) 3? S —3'*, O50; S 


The first inequality of (14) implies that 3’<w;<3b;-3"—3* and thus that 
v;2|t|, The second inequality requires that X¥;<3"—; thus 3"—X; 
if 

Let V; =0;—3b,;-3"/2. From (13), Q;20 if V2 SA;, where 


M; — — 1) — 6g 3 1 
15 A;= — — -3" =; 
(15) 3b; 4 


also 0; if V2? >G;, where G; = 
The inequalities (14) will be satisfied if the following are satisfied: 


Ait > 
(16) 


This gives the range on 2, viz., 


1/2 


3 
(17) G; + 


The two inequalities G;=>0 and A,/? <(3/2)b;-3"—3* together with (17) 
are sufficient that (16) be satisfied. Accordingly G;2=0 if 


3 1 
M;= 4 + 6g + + — 1) =k, 


and A; (3b;-3"/2—3°)? if 
1 
M; {( 3023" — 4 3% — + + 6g + — 1) = Lj. 


The relation will be satisfied if 1; <the lower bound in (9) and 
L;=the upper bound in (9). From these last inequalities we obtain 
L; a 


l; 
(18) (¢ = 1, 2, 3). 


When 1 is sufficiently large, i.e., »—6 is sufficiently large, certain terms of 
(18) having a power of 3 in the denominator are negligibly small. The con- 
stants b,, be, bs and C are determined so as to satisfy (18) with these terms 
omitted; that is, 


3 
j Vi, As S — 3°. 
2 
. 




9 10 3 


Then for $n \geq n_1$, say, these same constants will satisfy (18). Write (19) in
the form $J_i \leq 3^{i-1}C \leq S_i$.

The method used here for the choice of $b_1$, $b_2$, $b_3$ and $C$ differs somewhat
from that used by Dickson. For $p = 1$ and $p = 2$ the following choice of these
constants satisfies (19):

p | b_1  b_2  b_3     C
1 |  5    7   11    168
2 |  9   13   19   1760

Case $p \equiv 1$ (mod 3), $p = 3e + 1$, $e$ an integer $\geq 0$. Take $b_1$, $b_2$ and $b_3$ as linear
combinations of $e$. For the coefficient of $e$ in $b_1$ choose the least even integer
for which $J_1 \leq S_1$ as far as the coefficient of $e^4$ is concerned* and for the constant
term the value of $b_1$ displayed in the above table. The coefficient of $e$
in $b_2$ is taken as the least even integer for which the coefficient of $e^4$ in $S_2/3$
is $\geq$ that in $J_1$, the constant term being chosen as before. Similarly, choose
$b_3$ such that the coefficient of $e^4$ in $S_3/9$ is $\geq$ the maximum of the coefficients
of $e^4$ in $J_1$ and $J_2/3$. Take $C$ to be the quartic polynomial in $e$ whose coefficients
are integers not less than the corresponding coefficients in $J_3/9$ and
differing from them by at most a quantity less than unity. The following constants
satisfy (19):

$$b_1 = 8e + 5, \quad b_2 = 12e + 7, \quad b_3 = 18e + 11,$$
$$C = 2228e^4 + 4806e^3 + 3830e^2 + 1329e + 168.$$


Case $p \equiv 2$ (mod 3), $p = 3e + 2$, $e$ an integer $\geq 0$. Apply the method outlined
above subject to the explanation given in the footnote. Choose

$$b_1 = 10e + 9, \quad b_2 = 14e + 13, \quad b_3 = 20e + 19,$$
$$C = 3740e^4 + 12{,}456e^3 + 15{,}510e^2 + 8551e + 1760.$$


The coefficients in C were chosen as above from the corresponding coefficients 
in J,. This choice satisfies (19). 
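
For reference, the constants of this section can be tabulated as a function of $p$. The sketch below merely transcribes the formulas quoted above for $p \equiv 1$ and $p \equiv 2$ (mod 3); the powers of $e$ are read from a damaged scan and are therefore an assumption, and the function name `constants_section3` is hypothetical. It does not verify the inequalities (19).

```python
def constants_section3(p: int) -> tuple[int, int, int, int]:
    """Return (b1, b2, b3, C) for p prime to 3, from the formulas quoted in the text."""
    e, residue = divmod(p, 3)
    if residue == 1:
        b = (8 * e + 5, 12 * e + 7, 18 * e + 11)
        C = 2228 * e**4 + 4806 * e**3 + 3830 * e**2 + 1329 * e + 168
    elif residue == 2:
        b = (10 * e + 9, 14 * e + 13, 20 * e + 19)
        C = 3740 * e**4 + 12456 * e**3 + 15510 * e**2 + 8551 * e + 1760
    else:
        raise ValueError("p must be prime to 3")
    return (*b, C)


# e = 0 reproduces the two rows of the table for p = 1 and p = 2.
print(constants_section3(1))  # (5, 7, 11, 168)
print(constants_section3(2))  # (9, 13, 19, 1760)
```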

We prove that $Q_i$ is an integer. From (13), this requires $M_i = M_i' - E \equiv B_i + 6g$ (mod
$p$), and from (7), $E \equiv 3gy$ (mod $p$). The definition of $g$, the respective values
of $u$ in Cases I and II of §2, and $q \equiv 0$ (mod $p$) give $g \equiv u$ (mod $p$), $(u, p) = 1$,
and thus $(g, p) = 1$. Accordingly, $y$ may be chosen such that $E$ is congruent
to any assigned integer modulo $p$, and thus such that $E \equiv M_i' - B_i - 6g$ (mod
$p$). Hence $Q_i$ is an integer.

* In some cases, with this choice of the coefficient of $e$ in $b_1$, it is not possible to choose the
coefficient of $e^4$ in $C$ to satisfy $J_1 \leq C \leq S_1$ and $J_2 \leq 3C \leq S_2$. In these cases take the next even integer as this
coefficient.
The range D; of values of v;, from (17), is 


1/2 
Ay Mi 


1+ (1 — 


1/2 32n-1 2 322-1 


> 


where 
2-32n-1 


From (15) and (9) 


3bip 4 


Therefore 
2n-1 


D;> - = 
2-3%-C 2 3 1/2 
6; | — +1 
3bip 3); 4 


which for n=me, say, exceeds 8. 

The quantity $Q_i$ is representable as the sum of three integral squares. For,
from (13), $2pQ_i \equiv 2M_i - 12g - 2B_i$ (mod $8p$). Take $3b_i\cdot 3^n$ as the odd integer $n$ of Lemma 2 and
choose $v_i$ modulo 8 such that $v_i(3b_i\cdot 3^n - v_i) \equiv 2\zeta_i$ (mod 8), where $\zeta_i$ is an arbitrary
integer. Thus, from (12),

$$2B_i \equiv -3pb_i\cdot v_i(3b_i\cdot 3^n - v_i) + 6gb_i \equiv 6gb_i - 6pb_i\zeta_i \pmod{8p}.$$

By choice of $y$ we made $M_i \equiv 6g + 3gb_i$ (mod $p$), from which $M_i = 6g + 3gb_i
+ k_ip$, where $k_i$ is an integer. Substitute these relations for $M_i$ and $2B_i$ into
the above congruence for $2pQ_i$. Thus

$$2pQ_i \equiv 2k_ip + 6pb_i\zeta_i \pmod{8p},$$
$$Q_i \equiv k_i + 3b_i\zeta_i \pmod 4.$$

Since $(3b_i, 4) = 1$, $\zeta_i$ can be chosen such that $Q_i \equiv 1$ (mod 4).*
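
The purpose of forcing $Q_i \equiv 1$ (mod 4) is that such an integer cannot have the excluded form $4^a(8b + 7)$ of the three-square criterion cited in the footnote. A brute-force illustration is sketched below; the function names are illustrative, and the search is practical only for small $Q$.

```python
from math import isqrt


def is_sum_of_three_squares(q: int) -> bool:
    """Brute-force search for q = x^2 + y^2 + z^2 with integers x, y, z >= 0."""
    for x in range(isqrt(q) + 1):
        for y in range(x, isqrt(q - x * x) + 1):
            z2 = q - x * x - y * y
            z = isqrt(z2)
            if z * z == z2:
                return True
    return False


def has_excluded_form(q: int) -> bool:
    """True when q = 4^a (8b + 7) for integers a, b >= 0."""
    while q > 0 and q % 4 == 0:
        q //= 4
    return q % 8 == 7


# Every positive integer congruent to 1 modulo 4 escapes the excluded form.
print(all(is_sum_of_three_squares(q) for q in range(1, 400, 4)))
```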
It has been shown that every integer $s \geq C\cdot 3^{3\nu}$, where $\nu = \max(3, \delta + 1, n_1,
n_2)$, is a sum of nine values of (5) for values of $X \geq |t|$, the arguments of
$F(X)$ being $3m_i$, $3^n - X_j$, $3^n + X_j$, $v_i$ and $w_i$ ($j = 1, 2, 3$). Thus every integer
$s \geq C\cdot 3^{3\nu} + 9a$ is a sum of nine values of $f(x) = F(X) + a$ for $X \geq |t|$ or $x \geq 0$.
Theorem 1 is immediate.

* A sufficient condition that $Q_i$ be representable as the sum of three integral squares is that $Q_i$
be not of the form $4^a(8b + 7)$, where $a \geq 0$, $b \geq 0$, $a$ and $b$ being integers. See Landau, Vorlesungen über
Zahlentheorie, p. 123, Theorem 187.

4. Functions (5) with $p = 3p_1$, $p_1 \not\equiv 2g$ (mod 3). The following theorem will
be proved:

THEOREM 2. Let the integers $p$, $q$ and $u$ satisfy the conditions $p = 3p_1$, $p_1 \not\equiv 2g$
(mod 3), $q = -3p_1t$, $t$ an integer, $u = 1$ or $(1 - 3p_1 - q)/2$, and let $a$ be defined
by (4). Then for each such triple $p$, $q$ and $u$ there exist constants $C$ and $\nu$ such
that every integer $s \geq C\cdot 3^{3\nu} + 9a$ is a sum of nine integral values $\geq 0$ of (3) for
integral values $\geq 0$ of $x$.

For this section the function defined in (5) becomes

$$G(X) = \frac{p_1}{2}(X^3 - X) + gX. \tag{20}$$

Let $|t| < 3^{\delta}$, $\delta$ being an integer $\geq 0$.

Lemma 4. For each integer $s$ there exists an integer $m'$, $|t| < 3^{\delta} \leq m' < 3^n + 3^{\delta}$,
such that $s \equiv G(m')$ (mod $3^n$).

When we note that $g \equiv u \equiv 1$ or 2 (mod 3) according as $G(X)$ comes under
Cases III$_1$, IV, or III$_2$ of §2, and denote $m'' + 3^{\delta}$ by $m'$, $k' + 3^{\delta}$ by $k$, the proof
of Lemma 4 is analogous to that of Lemma 1.

Consideration of the possible values of $p$, $q$, and $u$ as noted in III$_1$, III$_2$,
and IV gives $g \leq 1$.


Lemma 5. If n=5+1, gS1 and then G(m) 
<33"y, where 


+ 1)? + (3p: + 1)? + 3p: +1) +3). 
For, 
G(m) < + 1) + — {3*(3p, + 1) + 3°} ] 


+ 3*(3p: + 1) + 3? 
33n 
[(3p1 + + + 1)? + + 1) + 1] + 3%(3~ + 2) 
< 
The integer $s$ may be written, by Lemma 4 and the method used to obtain
(7) and (8), in the form

$$s = G(m) + 3^nM \qquad (M = M' - E \text{ an integer}), \tag{21}$$

where

$$E = \frac{p_1}{2}(3yz^2 + 3\cdot 3^ny^2z + 3^{2n}y^3 - y) + gy. \tag{22}$$
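
Formula (22) is the quotient of $G(z + 3^ny) - G(z)$ by $3^n$. A short exact check is sketched below, assuming $G(X) = \frac{p_1}{2}(X^3 - X) + gX$ and the reading of (22) as $E = \frac{p_1}{2}(3yz^2 + 3\cdot 3^ny^2z + 3^{2n}y^3 - y) + gy$; the names `G` and `E22` are illustrative.

```python
from fractions import Fraction


def G(p1: int, g: int, X: int) -> Fraction:
    return Fraction(p1, 2) * (X**3 - X) + g * X


def E22(p1: int, g: int, n: int, y: int, z: int) -> Fraction:
    """Assumed reading of (22)."""
    A = 3**n
    return Fraction(p1, 2) * (3 * y * z * z + 3 * A * y * y * z + A * A * y**3 - y) + g * y


p1, g, n = 4, 1, 2
print(all(
    G(p1, g, z + 3**n * y) - G(p1, g, z) == 3**n * E22(p1, g, n, y, z)
    for y in range(0, 5) for z in range(0, 12)
))
```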


The inequalities and equalities (9)-(19) inclusive along with the arguments
relative to them are applicable to this section when $F$ is replaced by $G$,
and $p$ by $3p_1$. The constants $b_1$, $b_2$, $b_3$ and $C$ must satisfy


27 10 9 
(23) + +74 7° + fi. 


For $p_1 = 1$, 2, and 3 the following values of these constants satisfy (23):

p_1 | b_1  b_2  b_3     C
 1  |  5    7   11    505
 2  |  9   13   19   5330
 3  |  9   13   19   9061


Case $p_1 \equiv 0$ (mod 3), $p_1 = 3e$, $e$ an integer $\geq 1$. Choose

$$b_1 = 8e + 1, \quad b_2 = 12e + 1, \quad b_3 = 18e + 1,$$
C = 6683et + 2100c? + 324e? + 


This set of values satisfies (23). This $C$ is obtained from the polynomials in $e$
which represent $J_1$ and $J_3/9$ when the above values of $b_1$, $b_2$ and $b_3$ are substituted
in them.

Case $p_1 \equiv 1$ (mod 3), $p_1 = 3e + 1$, $e$ an integer $\geq 0$. The values

$$b_1 = 8e + 5, \quad b_2 = 12e + 7, \quad b_3 = 18e + 11,$$
$$C = 6683e^4 + 14{,}432e^3 + 11{,}505e^2 + 3992e + 505$$

satisfy (23).

Case $p_1 \equiv 2$ (mod 3), $p_1 = 3e + 2$, $e$ an integer $\geq 0$. The values

$$b_1 = 8e + 9, \quad b_2 = 12e + 13, \quad b_3 = 18e + 19,$$
$$C = 6683e^4 + 25{,}529e^3 + 36{,}223e^2 + 22{,}575e + 5330$$

satisfy (23).


The quantity $Q_i$ as defined by (13) with $p = 3p_1$ is an integer. For, from
(22),

$$2E \equiv (2g - p_1)y \pmod{3p_1}.$$




Case $p_1$ is odd. From the definition of $g$, $2g \equiv 2u \equiv 1$ or 2 (mod $p_1$), and so
$(2g - p_1, p_1) = 1$. This, together with $(2g - p_1, 3) = 1$, gives $(2g - p_1, 3p_1) = 1$.
Accordingly, by choice of $y$ modulo $3p_1$, $2E$ may be made congruent to any
assigned integer modulo $3p_1$, from which it follows that the same is true of $E$.

Case $p_1$ is even, $p_1 = 2p_2$. Since $g - p_2 \equiv 1$ (mod $p_2$), $(g - p_2, p_2) = 1$, and so
$(g - p_2, 3p_2) = 1$. Therefore, from this and $E \equiv (g - p_2)y$ (mod $3p_2$), it follows
that $E$ may be made congruent to any assigned integer modulo $3p_2$ by choice
of $y$ modulo $3p_2$. Write $E = k + \rho\cdot 3p_2$, where $k$ and $\rho$ are integers, $k$ being
arbitrary and $0 \leq k < 3p_2$. Let $E'$ be the expression $E$ with $y$ replaced by
$y + 3p_2$. Hence

$$E' - E = 3p_2\left[p_2\{3z^2 + 3^{n+1}z(2y + 3p_2) + 3^{2n}(3y^2 + 9p_2y + 9p_2^2) - 1\} + g\right].$$

When $p_1$ is even, $g$ is odd, and hence $E' - E$ is an odd multiple of $3p_2$. Accordingly,
if we choose $y$ modulo $3p_1$ we obtain for each value of $k$ two values
of $\rho$, one even and one odd. Thus there are $3p_1$ values of $E \equiv k + \rho\cdot 3p_2$ (mod
$3p_1$), where $0 \leq k < 3p_2$, $\rho = 0, 1$, and these values are incongruent modulo $3p_1$,
each to each. Hence, by choice of $y$ modulo $3p_1$, $E$ may be made congruent
to any assigned integer modulo $3p_1$.

Choose $y$ such that $E \equiv M_i' - B_i - 6g$ (mod $3p_1$). This choice, according
to (13), makes $Q_i$ an integer.

As in §3, for $n$ sufficiently large, $n \geq n_2$ say, $D_i > 8$. From (13), $6p_1Q_i \equiv 2M_i
- 12g - 2B_i$ (mod $8p_1$). Using Lemma 2 we may make $v_i(3b_i\cdot 3^n - v_i) \equiv 2\zeta_i$
(mod 8), where $\zeta_i$ is arbitrary. Accordingly, $2B_i \equiv 6gb_i - 18p_1b_i\zeta_i$ (mod $8p_1$). By
the above choice of $y$, $M_i \equiv B_i + 6g \equiv 3gb_i + 6g$ (mod $3p_1$), and hence $M_i = 6g
+ 3gb_i + h_i\cdot 3p_1$, where $h_i$ is an integer. Substituting these expressions for $2B_i$
and $M_i$ in the above congruence involving $6p_1Q_i$, we obtain

$$6p_1Q_i \equiv 6h_ip_1 + 18p_1b_i\zeta_i \pmod{8p_1},$$
$$Q_i \equiv h_i + 3b_i\zeta_i \pmod 4.$$

Choose $v_i$ such that the corresponding value of $\zeta_i$ makes $Q_i \equiv 1$ (mod 4). Accordingly,
$Q_i$ is representable as the sum of three integral squares.

This completes the proof that every integer $s \geq C\cdot 3^{3\nu}$, where $\nu = \max(\delta + 1,
n_1, n_2)$, and $C$ has been determined, is a sum of nine values of $G(X)$ for $X \geq |t|$,
the arguments of $G(X)$ being $m_i$, $3^n - X_j$, $3^n + X_j$, $v_i$ and $w_i$ ($j = 1, 2, 3$).
Hence every $s \geq C\cdot 3^{3\nu} + 9a$ is a sum of nine positive integral values of $f(x)$
given by (3) with $p = 3p_1$, the arguments of the functions $f(x)$ being derived
from those above by means of $x = X + t$. Theorem 2 is immediate.

5. Functions (5) for $p = 3p_1$, $p_1 \equiv 2g$ (mod 3). This section deals with
functions of the form (20) where the restrictions on $p_1$ are not as strong as
those stated in Theorem 2. The results of this section include those of the
last* but since the weight of the representation of integers sufficiently large
has to be increased to ten, §4 gives better results for the special functions
considered there.

We prove 


THEOREM 3. Let integers $p_1$, $q$ and $u$ be given satisfying the conditions $p_1 \equiv 2g$
(mod 3), $q = -3p_1t$, $t$ an integer, $u$ as in Theorem 2, and let $a$ be defined by (4).
Then there exist integers $C$ and $\nu$ such that every integer $\geq C\cdot 5^{3\nu} + 10a$ is a sum
of ten values of the function (3) with this triple $p_1$, $q$ and $u$ as its coefficients, for
positive integral values of $x$.


The theory in this section differs from that in the preceding sections in 
three main particulars: 

(1) two values of the function G(X), instead of one, are subtracted 
initially from the integer (see Lemma 7), 

(2) the prime 5 is used instead of 3, 

(3) the interval in which s lies is divided into five sub-intervals instead 
of three. 

The fact that a lemma analogous to Lemma 1 cannot be obtained for the 
functions considered here, even with the modulus changed to any prime up 
to 23 inclusive, necessitates the first change noted above. 


Lemma 6. The positive integers $s$ and $n$ being given, there exist integers $k_1$, $k_2$
and $r$ such that $s \equiv G(k_1 + 2r\cdot 5^n) + G(k_2 + 2r\cdot 5^n)$ (mod $5^n$), where $0 \leq k_1 < 5^n$,
$0 \leq k_2 < 5^n$, $0 \leq r < 5$.


This lemma is proved by induction on $n$. For $n = 1$ we considered all possible
combinations of values of $p_1$ and $g$ modulo 5 and showed in each case
that it is possible to choose integers $k_1$ and $k_2$ for which $s \equiv G(k_1) + G(k_2)$
(mod 5),

$$p_1(3k_1^2 + 3k_2^2 - 2) + 4g \not\equiv 0 \pmod 5, \qquad 0 \leq k_1 < 5 \text{ and } 0 \leq k_2 < 5.$$

For example, if $p_1 \equiv 1$, $g \equiv 1$ (mod 5), then

$$0 \equiv G(0) + G(0), \quad 1 \equiv G(1) + G(2), \quad 2 \equiv G(1) + G(1),$$
$$3 \equiv G(4) + G(4), \quad 4 \equiv G(4) + G(2) \pmod 5;$$

these values of $k_1$ and $k_2$ satisfy the conditions stated.

* The constants $b_i$ ($i = 1, \ldots, 5$) and $C$ are not calculated here for $p_1 \equiv 0$ (mod 3). This would
have to be done before the results of Theorem 3 could be applied to all functions considered in §4.

Now, as the induction
hypothesis, let the integers $k_1$ and $k_2$ exist such that $s = G(k_1) + G(k_2) + k\cdot 5^n$,
$k$ being an integer, where $0 \leq k_1 < 5^n$, $0 \leq k_2 < 5^n$, and $p_1(3k_1^2 + 3k_2^2 - 2) + 4g \not\equiv 0$
(mod 5). Then, if $r$ is an integer,

$$G(k_1 + 2r\cdot 5^n) + G(k_2 + 2r\cdot 5^n) \equiv s + 5^n\left[r\{p_1(3k_1^2 + 3k_2^2 - 2) + 4g\} - k\right] \pmod{5^{n+1}}. \tag{24}$$

Since, by the hypothesis for the induction, the coefficient of $r$ in the square
bracket of (24) is prime to 5, we may choose $r$ modulo 5 such that the coefficient
of $5^n$ is congruent to zero modulo 5. Thus

$$s \equiv G(k_1 + 2r\cdot 5^n) + G(k_2 + 2r\cdot 5^n) \pmod{5^{n+1}}.$$

The induction is complete.
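
The induction is constructive and can be run as a small lifting procedure. The sketch below assumes $G(X) = \frac{p_1}{2}(X^3 - X) + gX$ and follows the proof: find a base pair modulo 5 meeting the side condition, then at each level search $r$ in $\{0, \ldots, 4\}$. The names `base_pair` and `lift` are illustrative, the residues are reduced modulo $5^{n+1}$ after each step (a simplification not stated in the text), and only the case $p_1 = g = 1$ of the worked example is exercised here.

```python
def G(p1: int, g: int, X: int) -> int:
    # X^3 - X is even, so the integer division is exact.
    return p1 * (X**3 - X) // 2 + g * X


def base_pair(p1: int, g: int, s: int):
    """Find (k1, k2) in 0..4 with G(k1)+G(k2) = s (mod 5) and the side condition."""
    for k1 in range(5):
        for k2 in range(5):
            hits = (G(p1, g, k1) + G(p1, g, k2) - s) % 5 == 0
            side = (p1 * (3 * k1 * k1 + 3 * k2 * k2 - 2) + 4 * g) % 5 != 0
            if hits and side:
                return k1, k2
    return None


def lift(p1: int, g: int, s: int, n: int):
    """Return (k1, k2) with G(k1) + G(k2) = s (mod 5^n), or None on failure."""
    pair = base_pair(p1, g, s)
    if pair is None:
        return None
    k1, k2 = pair
    for level in range(1, n):
        step, mod = 2 * 5**level, 5**(level + 1)
        for r in range(5):
            if (G(p1, g, k1 + r * step) + G(p1, g, k2 + r * step) - s) % mod == 0:
                k1, k2 = (k1 + r * step) % mod, (k2 + r * step) % mod
                break
        else:
            return None  # no admissible r found at this level
    return k1, k2


k1, k2 = lift(1, 1, 1934, 3)
print((G(1, 1, k1) + G(1, 1, k2) - 1934) % 5**3 == 0)
```

The shift $2r\cdot 5^n$ rather than $r\cdot 5^n$ is what keeps the halved cubic term integral in the expansion (24).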
Substitute $h_1 = k_1 + 2r\cdot 5^n$ and $h_2 = k_2 + 2r\cdot 5^n$ in Lemma 6. We obtain

Lemma 7. For any given integers $s$ and $n$ there exist integers $h_1$ and $h_2$ such
that $s \equiv G(h_1) + G(h_2)$ (mod $5^n$), where $h_1 < 9\cdot 5^n$ and $h_2 < 9\cdot 5^n$.

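Choose $\delta$ and $n$ such that $|t| < 5^{\delta}$ and $n \geq \delta$. Let $m_1 = h_1 + 5^n$ and $m_2 = h_2
+ y\cdot 5^n$, where $1 \leq y \leq 3p_1$. Substitution of $m_1$ and $m_2$ into $G(X)$ gives $G(m_1)
\equiv G(h_1)$ (mod $5^n$) and $G(m_2) = G(h_2) + 5^nE$, where

$$E = \frac{p_1}{2}(3yh_2^2 + 3\cdot 5^nh_2y^2 + 5^{2n}y^3 - y) + gy. \tag{25}$$

Combining these results with Lemma 7 we obtain

$$s \equiv G(m_1) + G(h_2) \pmod{5^n},$$
$$s = G(m_1) + G(h_2) + 5^nM' = G(m_1) + G(m_2) + 5^nM,$$

where $M'$ is an integer, $M = M' - E$, and $|t| < 5^n \leq m_1 < 10\cdot 5^n$, $|t| < 5^n \leq m_2
< (3p_1 + 9)5^n$.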

Lemma 8. If $n \geq 1$, $g \leq 1$, $5^n \leq m_1 < 10\cdot 5^n$, $5^n \leq m_2 < (3p_1 + 9)5^n$, then $-2a
\leq G(m_1) + G(m_2) < 5^{3n}\gamma$, where


(26) 


(27) + + 1002]. 


The statement $g \leq 1$ in §4 holds here. Substitution of the upper bounds
for $m_1$ and $m_2$ into $G(m_1)$ and $G(m_2)$ gives


1001 pr 
G(m) < p53", G(m2) < + 9)3 + 1]5%. 


Since $f(x) = G(X) + a \geq 0$, then $G(X) \geq -a$. The statement of the lemma follows.




As stated formerly, the interval $C\cdot 5^{3n} \leq s \leq C\cdot 5^{3n+3}$ is subdivided such
that

$$3^{i-1}C\cdot 5^{3n} \leq s_i \leq 3^iC\cdot 5^{3n} \qquad (i = 1, 2, 3, 4, 5). \tag{28}$$


The following replacements will transform the relations used in §3 into
those used here. Replace $3^n$ by $5^n$ and $p$ by $3p_1$ throughout; in (9) replace
$a$ by $2a$; replace $v_i + w_i = 3b_i\cdot 3^n$ by $v_i + w_i = 5b_i\cdot 5^n$; in (14) replace 3 by 5;
let $V_i = v_i - 5b_i\cdot 5^n/2$. As a result we obtain inequalities corresponding to (19),


125 78 125 
(29) + +va3C +h (@=1,---,5). 


The constants $b_i$ ($i = 1, \ldots, 5$) and $C$ are chosen in accordance with (29)
as follows:

$p_1 \equiv 1$ (mod 3),
$$b_1 = 8e + 9, \quad b_2 = 12e + 13, \quad b_3 = 18e + 19, \quad b_4 = 24e + 27, \quad b_5 = 36e + 39,$$
$$C = 30{,}497e^4 + 106{,}839e^3 + 134{,}404e^2 + 70{,}596e + 12{,}759;$$

$p_1 \equiv 2$ (mod 3), the same $b_i$ as for $p_1 \equiv 1$ (mod 3), and
$$C = 30{,}497e^4 + 117{,}126e^3 + 167{,}074e^2 + 107{,}572e + 27{,}165.$$


The quantity Q; is an integer. For, from (25), E=gy (mod 34;). Also 
g=1 (mod 3f;) or 2g=1 (mod 39;) according as the cases being considered 
are III, and IV or III,; thus (g, 3f:) =1. Accordingly we may choose y such 
that $E$ is congruent to any assigned integer modulo $3p_1$; choose $y$ such that $E \equiv M_i' - B_i - 6g \pmod{3p_1}$. From (13) with the above replacements we see that $Q_i$ is an integer.

To prove that Q; is representable as the sum of three integral squares, 
we proceed as follows. Equation (13) with p=3%, and 3 replaced by 5 gives 
6:10:=2M;—12g —2B; (mod 8,). The replacements described above give 


B; = sh| — 3y,(5b;-5" — — 1} + 


and thus 
2B; = 10gb; — 30p15;¢; (mod 81), 


where ¢; arises from the use of Lemma 2, as described in §3, and is arbitrary. 
Also, 


Sbipi 
B; = 1) + Sbig (mod 


Since, according to the last paragraph, M;=B;+6g (mod 34,), then 




5b; 

M; = 6g + Sbig + “(b2 — 1) + 3kip: (k; integral). 

This gives 
67:10; = + (mod 8), 


0; = k; (mod 4). 


Choose ¢; such that 0;=1 (mod 4). 

It has been shown that every integer $\ge C\cdot 5^{3\nu}$, where $C$ and $\nu$ have been determined, is a sum of ten positive integral values of the function (20), $p_1$ and $g$ being given, for integral values of $X \ge |t|$. The statement of the theorem follows.

6. Cubic functions without square term. L. E. Dickson, in his Transactions paper, did not consider the Waring problem for functions $f(x) = x + \epsilon(x^3 - x)/6$ where $\epsilon$ is a multiple of 3. Frances Baker* considered the problem for functions of the above form where $\epsilon = 3a$, $a$ odd and $a \equiv 1 \pmod 3$.

The work contained in Chapter I of Miss Baker's thesis, with two or three minor changes, holds equally well when the only restriction on $a$ is $a \not\equiv 2 \pmod 3$. For $a = 3e$ the following constants $b_1$, $b_2$, $b_3$ and $C$ satisfy the inequalities in her paper corresponding to (19) of this paper:

= 14e +9, be = 13, = 30e + 19, 

C = 30,497e + 57,713e8 + 36,552¢? + 7719e + 14. 
The proofs which it is necessary to change are contained on pages 12 and 13 
of her paper, these changes being in accordance with similar work contained 


in the previous part of this paper. This results in the following theorem, 
Miss Baker’s results being included: 


THEOREM 4. To each positive integer $a$, $a \not\equiv 2 \pmod 3$, there correspond positive integers $C$ and $\nu$ such that every integer $\ge C\cdot 3^{3\nu}$ is a sum of nine values of

$$(30)\qquad f(x) = x + \frac{a}{2}(x^3 - x)$$

for integral values $\ge 0$ of $x$.

Consider the problem for the functions (30) with $a \equiv 2 \pmod 3$. The results are stated in

THEOREM 5. To each positive integer $a$, $a \equiv 2 \pmod 3$, there correspond positive integers $C$ and $\nu$ such that every integer $\ge C\cdot 5^{3\nu}$ is a sum of ten values of (30) for integral values $\ge 0$ of $x$.


* A Contribution to the Waring Problem for Cubic Functions, Doctoral Dissertation, University 
of Chicago, 1934. 




The proof of Theorem 5 parallels that of Theorem 3, the major changes being necessitated by the requirement $x \ge 0$ instead of $X \ge |t|$ as heretofore required in this paper. This is due to the fact that it is not necessary to transform linearly our original function into another without square term before the theory is applied. By choosing $m_1 = h_1$ instead of $m_1 = h_1 + 5^n$ we lower the upper bound for $G(m_1)$ as contained in Lemma 7. The inequalities corresponding to (14) would be


0< 55;-5" and 050; 5. 


Choose the constants b; ({=1, - - - , 5) and C so that 


125 125 


are satisfied, the choice being
$b_1 = 8e + 5$, $b_2 = 12e + 7$, $b_3 = 16e + 11$, $b_4 = 24e + 15$, $b_5 = 34e + 21$,
$C = 27{,}365e^4 + 66{,}465e^3 + 62{,}661e^2 + 27{,}136e + 4681$.


The remainder of the proof is so similar to that of Theorem 3 that it is 
needless to repeat it here. 

Theorems 4 and 5 complete the general theory for cubic functions without square term.

7. Universal theorems. A universal theorem for weight nine is possible 
for only two functions of the type considered in Theorems 4 and 5. The func- 
tion (30) has the values 


$$f(0) = 0, \qquad f(1) = 1, \qquad f(2) = 2 + 3a.$$

If $f(2) \ge 12$, the integer 11 has a representation of weight eleven as a sum of functions (30), i.e., $11 = 11f(1)$. Accordingly, $a = 1$, 2, and 3 are the only values of $a$ for which universal theorems of weight ten are possible. The case $a = 1$ was considered by Miss Baker* and a universal theorem of weight nine was obtained. The case $a = 2$ reduces to the problem of cubes† for which the result is well known. For $a = 3$, $f(2) = 11$, and $21 = f(2) + 10f(1)$, a representation of weight eleven.
We prove 


* Loc. cit. 
† L. E. Dickson, Simpler proofs of Waring's theorem on cubes, with various generalizations, these Transactions, vol. 30 (1928), pp. 1-18.




THEOREM 6. Every integer $\ge 0$ is a sum of fifteen values of
$$(31)\qquad f(x) = x^3 + 3(x^2 - x)$$
for integral values $\ge 0$ of $x$.

This function (31) is the function (3) for $p = 3p_1$, $p_1 = 2$, $g = 6$, $u = 1$. These values of $p$ and $p_1$ satisfy the hypothesis of Theorem 3, and so, for $\nu$ sufficiently large, every integer $\ge C\cdot 5^{3\nu} + 50$ is a sum of ten values of (31). The integers $C$ and $\nu$ calculated from the theory are $C = 27{,}165$ and $\nu = 8$. We have to prove that all integers $< 27{,}165\cdot 5^{24} + 50$ can be represented as a sum of fifteen values of (31).

A table of minimum weights of the representations of integers 1-1000 
shows that integers 298-1000, inclusive, have weight 8; 169, 83 and 41 are 
the largest integers of weights 11, 13 and 15, respectively. 
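
The table of minimum weights described above can be recomputed mechanically. The following is a minimal sketch, not the author's procedure: it assumes that "weight" means the least number of positive values of (31) whose sum is the given integer, and the names `f31` and `min_weights` are illustrative only.

```python
def f31(x: int) -> int:
    """Function (31): f(x) = x^3 + 3(x^2 - x)."""
    return x**3 + 3 * (x * x - x)


def min_weights(limit: int) -> list:
    """weights[n] = least number of positive values of f31 that sum to n."""
    # Positive values of f31 not exceeding the limit: 1, 14, 45, 100, ...
    values = []
    x = 1
    while f31(x) <= limit:
        values.append(f31(x))
        x += 1
    weights = [0] * (limit + 1)
    for n in range(1, limit + 1):
        # f31(1) = 1 is always available, so the minimum is never empty.
        weights[n] = 1 + min(weights[n - v] for v in values if v <= n)
    return weights


weights = min_weights(1000)
for w in (11, 13, 15):
    largest = max(n for n in range(1, 1001) if weights[n] == w)
    print(f"largest integer <= 1000 of weight {w}: {largest}")
print("largest weight on 298..1000:", max(weights[298:1001]))
```

For instance, 41 can only be written with zero, one, or two summands equal to 14, which gives weights 41, 28, and 15, so its minimum weight is 15.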

Apply the following theorem* to the data given above: 


THEOREM 7. Let a polynomial $f(x)$ take integral values $\ge 0$ for all integers $x \ge 0$; let $f(x+1) - f(x)$ increase with $x$. Suppose that every integer $n$ for which $l < n \le g + f(0)$ is a sum of $k - 1$ values of $f(x)$ for integers $x \ge 0$. Let $m$ be the maximum integer for which $f(m+1) - f(m) < g - l$. Then every integer $N$ for which $l + f(0) < N \le g + f(m+1)$ is a sum of $k$ values of $f(x)$ for integers $x \ge 0$.


Seven applications of Theorem 7 lead to the result that all integers $\le$ a constant greater than $27{,}165\cdot 5^{24} + 50$ are sums of fifteen values of (31). The proof of Theorem 6 is complete.
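
The seven applications of Theorem 7 can be traced numerically. The sketch below is an illustration under stated assumptions: it takes the table data as $l = 297$, $g = 1000$ with eight summands, uses $f(0) = 0$ for the function (31), and reads the conclusion of Theorem 7 as raising the upper limit from $g$ to $g + f(m+1)$ at the cost of one extra summand. The names `ascend` and `TARGET` are hypothetical.

```python
def f31(x: int) -> int:
    """Function (31): f(x) = x^3 + 3(x^2 - x); note f31(0) = 0."""
    return x**3 + 3 * (x * x - x)


def ascend(l: int, g: int) -> int:
    """One application of Theorem 7 for f31: return the new upper limit g + f(m+1)."""
    gap = g - l

    def diff(m: int) -> int:
        return f31(m + 1) - f31(m)

    if diff(0) >= gap:
        raise ValueError("interval too short for Theorem 7 to apply")
    # Binary search for the largest m with diff(m) < gap (diff is increasing).
    lo, hi = 0, 1
    while diff(hi) < gap:
        hi *= 2
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if diff(mid) < gap:
            lo = mid
        else:
            hi = mid
    return g + f31(lo + 1)


TARGET = 27165 * 5**24 + 50
l, g, summands = 297, 1000, 8
while g <= TARGET:
    g = ascend(l, g)
    summands += 1
    print(f"{summands} summands suffice up to {g}")
```

Under these assumptions the first step gives $m = 13$ and the limit $1000 + f(14) = 4290$; the loop stops once the limit exceeds $27{,}165\cdot 5^{24} + 50$.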

8. Generalization.† In this section we show that the theory contained
in §§3, 4 and 5 holds for values of the parameters p, g and u of (3) subject 
only to the condition g = —tp. We shall consider §3. If, in Lemma 3, we use 
g< |g| in place of g<13p+1, the only effect is to alter certain terms of (18). 
When we pass to (19) these terms drop out, so the same set of $b_1$, $b_2$, $b_3$ and $C$ will suffice. In the proof that $Q_i$ is an integer there is the restriction $(g, p) = \theta = 1$. However, if $\theta > 1$, $F(X)$, and likewise any sum of values of $F(X)$, would be a multiple of $\theta$. As stated at the beginning of §2, functions of this
type do not enter into the theory. Accordingly, the parameter g is arbitrary 
and the result stated above follows for §3. Similar results for §§4 and 5 are 
immediate. 

*L. E. Dickson, Waring’s problem for cubic functions, these Transactions, vol. 36 (1934), pp. 
1-12. This is Dickson’s Theorem 3. 


† Section appended June 30, 1934. This is analogous to theory recently obtained by L. E. Dickson.


UNIVERSITY OF CHICAGO, 
CHICAGO, ILL.


THE APPLICATION OF THE THEORY OF ADMISSIBLE 
NUMBERS TO TIME SERIES WITH 
CONSTANT PROBABILITY* 


BY 
FRANCIS REGAN 


I. INTRODUCTION 


The idea of admissible numbers in probability is a new one. Its develop- 
ment has come about during the past few years through the researches of 
Copeland.† Admissible numbers furnish a method for testing the consistency
of the assumptions of the theory of probability and also serve as a guide for 
setting up sets of assumptions. The problem of testing consistency is a very 
extensive one, since in almost every branch of the theory of probability new 
assumptions are made. 

This paper is an extension of the concept of admissibility to time series. 
A time series is a sequence of occurrences, which are represented by a set of 
points on a time axis. These points must satisfy a certain law; namely, that 
there is a definite probability of getting a point in any interval. This proba- 
bility may vary according to the length of the interval, or according to the 
length of the interval and the position of the interval. 

In order that a probability situation may have any meaning from the 
statistical point of view, it must be capable of being repeated a large number 
of times under similar circumstances. A type of time series which is dealt with 
here has these repetitions given directly by the time series. In order that this 
may be the case, the probability of obtaining a point in any interval must be a 
periodic function of the position of the interval; that is, the coordinate of the 
left hand extremity. 

Since this set of points possesses the property necessary to use the statis- 
tical point of view of probability without any modifications, then if it is to 
satisfy the fundamental assumptions of the theory of probability, it is neces- 


* Presented to the Society, April 15, 1933; received by the editors October 4, 1933. The author 
wishes to acknowledge his appreciation to Arthur H. Copeland, of the University of Michigan, for 
many helpful suggestions throughout the progress of the work. 

† A. H. Copeland, Admissible numbers in the theory of probability, American Journal of Mathematics, vol. 50, No. 4, Oct., 1928. Independent event histories, loc. cit., vol. 51, No. 4, Oct., 1929. Admissible numbers in the theory of geometrical probability, loc. cit., vol. 53, No. 1, Jan., 1931. The theory of probability from the point of view of admissible numbers, Annals of Mathematical Statistics, vol. 3, No. 3, Aug., 1932.


511 


512 FRANCIS REGAN [July 


sary that sequences of successes and failures be represented by the digits of 
admissible numbers. The number of conditions imposed upon these points 
has the power of the continuum, since, for every interval, a different set of 
conditions is obtained. It is the purpose of this paper to show that these con- 
ditions are consistent and can be satisfied. 

The time series will be represented by the set of points $\tau_1 < \tau_2 < \cdots < \tau_i < \cdots$. Let $f(\alpha, \tau, t)$ be the probability of $\alpha$ points of the series lying in an interval of length $\tau$, beginning at time $t$. It follows from the preceding discussion that it is necessary for $f(\alpha, \tau, t)$ to be periodic in $t$. The $k$th interval $I_k$ of the series is defined as $t + (k-1)\Lambda < h \le t + \tau + (k-1)\Lambda$, where $\Lambda$ is a period or an integral multiple of a period and $t + \tau \le \Lambda$. Let $x(\alpha, \tau, t, \Lambda)$ be a number such that its $k$th digit is one if there are exactly $\alpha$ points in $I_k$ and zero otherwise. Since this paper deals with constant probability, it may be seen that the probability of $\alpha$ points of the series lying in $\tau$ is independent of $t$, and that every $\Lambda$ is a period. In this case, the time series $\tau_1 < \tau_2 < \cdots < \tau_i < \cdots$ will be constructed so that the number $x(\alpha, \tau, t, \Lambda)$ is an element of the set $A[f(\alpha, \tau, t)]$* for every $\alpha$, $\tau$, $\Lambda$, where $\tau = m\cdot 2^{-\sigma+1}$, $t = r\cdot 2^{-\sigma+1}$ and $\Lambda = p\cdot 2^{-\sigma+1}$, where $\sigma$, $p$, $m$, $r$ and $\alpha$ are positive integers and $r + m \le p$.† The construction of this series that will be given is, in a sense, a numerical construction and is probably about the simplest mathematical construction possible.


The case where the probability of $\alpha$ points lying in $\tau$ is dependent on $t$ will be discussed in a subsequent paper.


II. PROBABILITY OF AT LEAST ONE POINT LYING IN AN INTERVAL 


1. When the probability does not depend upon the beginning time, it is seen‡ that the probability of $\alpha$ points lying in an interval of length $\tau$ is

$$f(\alpha, \tau, 0) = \frac{(m\tau)^{\alpha} e^{-m\tau}}{\alpha!}.$$

Since $m$ is the ratio constant which determines the unit of time, there will be no loss in generality by choosing $m$ equal to one.
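
As a small numerical companion to this formula, the sketch below evaluates $f(\alpha, \tau, 0)$ and checks two of its basic properties. The function name `poisson_probability` is an assumption made for illustration and does not appear in the paper.

```python
import math


def poisson_probability(alpha: int, tau: float, m: float = 1.0) -> float:
    """f(alpha, tau, 0) = (m*tau)^alpha * exp(-m*tau) / alpha!"""
    return (m * tau) ** alpha * math.exp(-m * tau) / math.factorial(alpha)


tau = 0.5
# Probability of no point in an interval of length tau (with m = 1) is e^{-tau}.
print("f(0, tau, 0)     =", poisson_probability(0, tau))
# Hence the probability of at least one point is 1 - e^{-tau}.
print("1 - f(0, tau, 0) =", 1 - poisson_probability(0, tau))
# The probabilities over alpha = 0, 1, 2, ... add up to one.
print("sum over alpha   =", sum(poisson_probability(a, tau) for a in range(60)))
```

With $m = 1$ the value $1 - f(0, \tau, 0) = 1 - e^{-\tau}$ is the probability of at least one point in an interval of length $\tau$, which is the quantity treated in this part of the paper.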

For the present, we shall be concerned with the case in which at least one point lies in $\tau$. A geometrical construction of this time series may be formed


* The set $A[f(\alpha, \tau, t)]$ is the set of all admissible numbers associated with the probability function $f(\alpha, \tau, t)$.

Tt It should be observed that the function f(a,r, ¢) is independent of ¢, hence f(a, 7, )=f(a, 7, 0). 
Also the number x(a, r, ¢, A) is independent of #, since it is an element of the set A [f(a, r, #)]; that is, 
the numbers x(a, 7, 0, A) and x(a, 7, t, A) are members of the same set A [ f(a, r, 0)], provided that 
A=p-2-°*!, in the first number and 2~°*! in the second, wherer+m<X p. 

‡ See Fry, Probability and its Engineering Uses, pp. 216-27 and pp. 232-35. A more rigorous proof than given by Fry will be published later in a joint paper by the author and Professor A. H. Copeland.


1934] TIME SERIES WITH CONSTANT PROBABILITY 513 


to illustrate the phenomena arising from the probability function $[1 - f(0, \tau, 0)]$. Construct a set of segments of length $\tau$ on a line. A set of points $\tau_1 < \tau_2 < \cdots < \tau_i < \cdots$ is distributed along this line in such a manner that the probability of at least one point lying in an interval of length $\tau$ is $[1 - f(0, \tau, 0)]$. Here the $k$th interval $I_k$ is $(k-1)\tau < h \le k\tau$.* In $I_k$, there may be no points or at least one. Let us define $\sim x(0, \tau, 0, \tau)$ such that its $k$th digit is one if there is at least one point in $I_k$ and zero otherwise. The success ratio is

$$p_N(\sim x_\tau) = \frac{1}{N}\sum_{k=1}^{N} \sim x_\tau^{(k)},$$

and we demand that $p(\sim x_\tau) = 1 - f(0, \tau, 0)$, where

$$p(\sim x_\tau) = \lim_{N\to\infty} p_N(\sim x_\tau).$$
Now 


A physical illustration of a time series of this character is the occurrences of quakes in a certain region. Let us take for example, the quakes actually occurring from 1490 A.D. to 1930 A.D. in the region which is now Mexico, assuming that records of such could have been kept. These data would be applicable to the above. Let $\tau$ represent the time span of ten years and the set of points properly graphed the quakes. Here $\sim x_\tau$ will be defined such that its $k$th digit is one if there is at least one quake in the $k$th decade and zero otherwise. The success ratio is

$$p_{44}(\sim x_\tau) = \frac{1}{44}\sum_{k=1}^{44} \sim x_\tau^{(k)}.$$

If these records of quakes were kept up indefinitely this success ratio would approach $(1 - e^{-\tau})$.
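
The bookkeeping in this illustration can be sketched as follows. The event years below are invented purely for the example and are not historical records; the helper names are likewise assumptions. The intervals are taken open on the left and closed on the right, matching $(k-1)\tau < h \le k\tau$.

```python
def decade_digits(event_years, start=1490, end=1930, span=10):
    """Digit k is 1 if some event year h satisfies (k-1)*span < h - start <= k*span."""
    n = (end - start) // span
    digits = [0] * n
    for year in event_years:
        offset = year - start
        if 0 < offset <= n * span:
            k = (offset + span - 1) // span  # ceiling division gives the interval index
            digits[k - 1] = 1
    return digits


def success_ratio(digits):
    """p_N: the fraction of the first N digits that are equal to one."""
    return sum(digits) / len(digits)


# Hypothetical event years, chosen only to exercise the code.
hypothetical_years = [1496, 1500, 1512, 1787, 1930]
digits = decade_digits(hypothetical_years)
print(len(digits), "decades; success ratio =", success_ratio(digits))
```

The span 1490 to 1930 contains 44 decades, which is why the success ratio in the text is $p_{44}$.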

2. We have constructed a time series which illustrates exactly how a 
physical event would be dealt with. It becomes our problem to build up an 
imaginary series which logically follows from the physical, but one that satis- 
fies the laws of admissibility. In testing the consistency of the assumptions 
made in developing the probability function, the time series must possess 
certain properties, one of which is an unlimited number of occurrences. These 
properties will be exhibited in the next paragraph where the series will be 
developed. 

* Here $t = 0$ and $\tau = \Lambda$.

† The symbol $\sim$ means "not." Let $\sim x(0, \tau, 0, \tau)$ be represented by $\sim x_\tau$.

‡ See Copeland, Admissible numbers in the theory of probability, American Journal of Mathematics, vol. 50, p. 536.

The time series may be represented by a set of points $\tau_1 < \tau_2 < \cdots < \tau_i < \cdots$ on the positive $\tau$-axis. These points satisfy the following law. Let this axis be divided into periods or intervals and let the $k$th interval $I_k$ be defined as $(k-1)p\cdot 2^{-\sigma+1} < h \le k\cdot p\cdot 2^{-\sigma+1}$, where $p$ and $\sigma$ are positive integers. We are considering the case where $t = 0$, $\Lambda = \tau = p\cdot 2^{-\sigma+1}$. Let $\sim x_\tau$ be such that the $k$th digit is one if there is at least one point of the time series in $I_k$ and zero otherwise. The points of the time series must possess the property that $p[\sim x_\tau] = [1 - e^{-\tau}]$ for every $p$ and $\sigma$.

3. We shall now construct the imaginary series. We shall construct a finite 
set of points of the series in such a manner that the conditions described 
above are satisfied to a certain degree of approximation when applied to unit 
intervals. We will call this set of points the first stage. The set of points for 
the second stage will be selected in such a manner that the conditions will 
hold to a certain degree of approximation when applied to unit intervals or 
intervals of length one half. For the third stage, the conditions can be applied 
to intervals of length 1, 1/2, or 1/4. We let $N_1$ be the number of intervals employed in the first stage. In the second stage, there will be $N_2$ unit intervals or $2\cdot N_2$ half unit intervals, etc. The choice of numbers $N_1, N_2, \cdots, N_s, \cdots$ will be determined at a later point in the paper. We shall construct the $N_1$ contiguous unit intervals of the first stage on the $\tau$-axis, beginning at time zero, the origin. The points of the time series will occupy the mid-points of these intervals. Let $X_1$ be a member of the set $A(1 - e^{-1})$ and let the $i$th unit interval contain a point of the time series if and only if the $i$th digit of $X_1$ equals one. If $j_i$ is the smallest number $j$ such that $j\cdot p_j(X_1) = i$, the points which we have constructed have the coordinates $\tau_i = j_i - \tfrac{1}{2}$. We have now constructed the first $N_1\cdot p_{N_1}(X_1)$ points of the time series.

To the $N_1$ unit intervals we will add $2\cdot N_2$ intervals of length 1/2. We will allow the next set of points of the time series to occupy the mid-points of these intervals. Those intervals which will contain points of the time series will be determined by the digits of a number $X_2$ which is a member of the set $A(1 - e^{-1/2})$. In general the $s$th set of points will occupy the mid-points of $2^{s-1}\cdot N_s$ intervals of length $2^{-s+1}$. Those intervals which will contain points of the series will be determined by the digits of a number $X_s$ which is a member of the set $A(1 - e^{-2^{-s+1}})$.

Let $\nu_s = N_1 + N_2 + \cdots + N_{s-1}$ $(\nu_1 = 0)$ and

$$\gamma_s = \sum_{k=1}^{s-1} 2^{k-1} N_k \cdot p_{2^{k-1}N_k}(X_k) \qquad (\gamma_1 = 0).$$

Then after the $s$th set of points has been chosen, we have determined $\gamma_{s+1}$ points and these points all lie in the interval from 0 to $\nu_{s+1}$. Let $s_i$ be such that $\gamma_{s_i} < i \le \gamma_{s_i+1}$ and let $j_i$ be the smallest $j$ such that $j\cdot p_j(X_{s_i}) = i - \gamma_{s_i}$. Then the coordinates of the points of the time series are given by the equations

$$(1)\qquad \tau_i = \nu_{s_i} + j_i\cdot 2^{-s_i+1} - 2^{-s_i}.$$


For example suppose that $i = \gamma_2 + 1$; then $s_i = 2$ and the point $\tau_i$ lies to the right of the point $\nu_2$. Let us suppose further that 3 is the smallest integer $j$ such that $j\cdot p_j(X_2) = i - \gamma_{s_i} = 1$. Then $\tau_i$ is the mid-point of the third half unit interval to the right of the point $\nu_2$; that is, $\tau_i = \nu_2 + 3\cdot 2^{-1} - 2^{-2}$.
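
The staged construction and equation (1) can be sketched in a few lines. This is an illustration only: the digit lists stand in for the leading digits of the numbers $X_s$ and are chosen by hand rather than taken from admissible numbers, and `build_points` is a hypothetical helper name.

```python
from fractions import Fraction


def build_points(stage_digits, stage_sizes):
    """Place a point at the mid-point of every stage-s interval whose digit is one.

    stage_digits[s-1] holds the first 2^(s-1) * N_s digits of X_s,
    stage_sizes[s-1] holds N_s.  Stage s starts at nu_s = N_1 + ... + N_(s-1).
    """
    points = []
    nu = Fraction(0)
    for s, (digits, n_s) in enumerate(zip(stage_digits, stage_sizes), start=1):
        width = Fraction(1, 2 ** (s - 1))          # interval length 2^(-s+1)
        count = 2 ** (s - 1) * n_s                 # number of intervals in stage s
        for j in range(1, count + 1):
            if digits[j - 1] == 1:
                # Equation (1): nu_s + j * 2^(-s+1) - 2^(-s)
                points.append(nu + j * width - width / 2)
        nu += n_s
    return points


# Stage 1: N_1 = 3 unit intervals; stage 2: N_2 = 2, i.e. four half-unit intervals.
points = build_points([[1, 0, 1], [0, 0, 1, 0]], [3, 2])
print(points)
```

Here $\nu_2 = 3$, and the only point of the second stage sits in the third half unit interval, so its coordinate is $3 + 3\cdot 2^{-1} - 2^{-2} = 17/4$, in agreement with the example above.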

Let us consider a further example. The time series consists of the sequence of points $\tau_1, \tau_2, \cdots, \tau_i, \cdots$ and the associated number $\sim x(0, 1, 0, 1)$ has the following sequence of digits:

(2) (ND, @), 4) 


Xi Xi (Xe v )(X2 )- 


where $X_2^{(1)} \vee X_2^{(2)} = 1$ whenever one or both of the numbers is equal to one, and $X_2^{(1)} \vee X_2^{(2)} = 0$ otherwise, etc. We have the equation $p(X_1) = 1 - e^{-1}$ and we will prove that

$$p[(1/2)X_2 \vee (2/2)X_2] = p[(1/4)X_3 \vee (2/4)X_3 \vee (3/4)X_3 \vee (4/4)X_3] = 1 - e^{-1}.$$

We will show that, for a proper choice of the numbers $N_1, N_2, \cdots$, $p(\sim x_1) = 1 - e^{-1}$. The numbers $\sim x_{1/2}$, $\sim x_{1/4}$, etc. can be treated in a similar manner.

4. The time series is defined when the numbers JV, are defined. We shall 
prove the following theorem. 


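THEOREM 1. If the time series consists of the points $\tau_1, \tau_2, \tau_3, \cdots$ satisfying the conditions

$$(H_1)\qquad \begin{cases} \tau_i = \nu_{s_i} + j_i\cdot 2^{-s_i+1} - 2^{-s_i} \quad (\nu_1 = 0), \\ s_i \text{ is such that } \gamma_{s_i} < i \le \gamma_{s_i+1}, \\ \gamma_s = \sum_{k=1}^{s-1} 2^{k-1} N_k \cdot p_{2^{k-1}N_k}(X_k) \quad (\gamma_1 = 0), \\ X_s \text{ is a member of the set } A(1 - e^{-2^{-s+1}}), \\ j_i \text{ is the smallest } j \text{ such that } j\cdot p_j(X_{s_i}) = i - \gamma_{s_i}; \end{cases}$$

then the numbers $N_1, N_2, \cdots, N_s, \cdots$ can be so chosen that for every $\tau$ satisfying the conditions

$$(H_2)\qquad \tau = p\cdot 2^{-\sigma+1}, \quad p \text{ and } \sigma \text{ are positive integers and } 0 < p \le 2^{\sigma-1},$$

the corresponding number $\sim x_\tau$ is an element of the set $A(1 - e^{-\tau})$.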




In order to prove this theorem, it is necessary to divide the time axis into periods of length $\tau$. Each period is composed of $p$ intervals, whose lengths are the same as those at the $\sigma$-stage, and the $k$th period $I_k$ is $(k-1)\tau < h \le k\tau$. At this stage the number representing the event which succeeds if there is at least one point in $I_k$ is

$$\sum_{q=1}^{p} \Big(\frac{q}{p}\Big) X_\sigma \vee \;{}^{*}$$

which is an element of $A(1 - e^{-\tau})$.† At the $(\sigma+1)$-stage, in order to represent the same $I_k$, it is necessary to increase the number of intervals by $p$, and hence double the number of operators. At this stage the required number is

$$\sum_{q=1}^{2p} \Big(\frac{q}{2p}\Big) X_{\sigma+1} \vee$$

which is also a member of the set $A(1 - e^{-\tau})$. Continuing this process to the $s$-stage, we have the number

$$\sum_{q=1}^{p\cdot 2^{s-\sigma}} \Big(\frac{q}{p\cdot 2^{s-\sigma}}\Big) X_s \vee$$

which is an element of the set $A(1 - e^{-\tau})$. Let this number be represented by $U_s$. When $s \ge \sigma$ and $\nu_s/\tau < k \le \nu_{s+1}/\tau$, then there exists at least one point in $I_k$ if and only if the digit $k - \nu_s/\tau$ of $U_s$ is one. Hence, when $s \ge \sigma$ the digits $(\nu_s/\tau) + 1$ to $\nu_{s+1}/\tau$ of $\sim x_\tau$ are the same as the digits 1 to $N_s/\tau$ of $U_s$. We shall now show that this is actually the case when $s \ge \sigma$.

The numbers $N_s$ heretofore mentioned are chosen so that $\nu_s/(n\tau)$ is an integer, if $n \le s$, $\sigma \le s$, and $0 < p \le 2^{\sigma-1}$, from which follows that $N_s/(n\tau)$ is an integer. If the $k$th digit of $X_s$ is one, then the $k$th interval at the $s$-stage contains a point of the time series. There are $2^{s-1}\cdot N_s$ intervals, and hence we use $2^{s-1}\cdot N_s$ digits of $X_s$. We may express $X_s$ as

from which we may form 


(p-2° 741) ({Ns/r—1) p-2° +41) 


X, 


* This symbol is used to represent the number $\{(1/p)X_\sigma \vee (2/p)X_\sigma \vee \cdots \vee (p/p)X_\sigma\}$. In general, let $\sum_{q=1}^{n}(q/n)Y \vee$ represent the number $\{(1/n)Y \vee \cdots \vee (n/n)Y\}$.
† See Lemma 1 in the second paragraph below.


1 

| = ) x! 

. . . . . . . 

x. Me 




We are considering $N_s/\tau$ digits of each of these numbers. It may be seen from these numbers that the first digit of $U_s$ is one if and only if at least one of the digits $X_s^{(1)}, X_s^{(2)}, \cdots, X_s^{(p\cdot 2^{s-\sigma})}$ is one. By analyzing the digits $X_s^{(1)}$ to $X_s^{(p\cdot 2^{s-\sigma})}$ inclusive, it is seen that if at least one of these is one, then there is at least one point of the time series in the first period of the $s$-stage; that is, of the interval $I_{(\nu_s/\tau)+1}$. Hence the $\{(\nu_s/\tau) + 1\}$th digit of $\sim x_\tau$ is the same as the first digit of $U_s$. This process may be continued for the $N_s/\tau$ digits of $U_s$.

Previously, we made use of the following lemma:


LEMMA 1. If $X_\sigma$ is an element of $A(p)$, the number $\{\sum_{q=1}^{n}(q/n)X_\sigma \vee\}$ is a member of the set $A[1 - (1-p)^n]$.


We know that 


(ne 


q=1 


Since the numbers $(q/n)X_\sigma$ $(q = 1, 2, \cdots, n)$ are independent, then $\sim(q/n)X_\sigma$ $(q = 1, 2, \cdots, n)$ are independent.† From (a), we get


= 


\% 


and hence, we have 


q=l 


q=1 \% 

We see that >°7-1(g/n)X.v has the desired probability, but we must now 
show that 


* Here the symbol $\prod \sim(q/n)X_\sigma\,\cdot$ represents the number $\{\sim(1/n)X_\sigma \cdot \;\cdots\; \cdot \sim(n/n)X_\sigma\}$. Such symbols used in this paper will have similar meanings. For the truth of this equality, see Copeland, The theory of probability from the point of view of admissible numbers, Annals of Mathematical Statistics, vol. 3, Aug., 1932, p. 149.

† See Copeland, Admissible numbers in the theory of probability, American Journal of Mathematics, Oct., 1928, Theorem 5, p. 542.




for every set of numbers $r_1, r_2, \cdots, r_k$ such that $0 < r_i \le m$ and $r_i \ne r_j$ if $i \ne j$.
Since 


~~ 


q=l 


mn 


The numbers $r_i$ are chosen such that for every set $r_1, r_2, \cdots, r_k$, we have $0 < r_i \le m$ and $r_i \ne r_j$ if $i \ne j$. Then the numbers $\sim\{[q + (r_i - 1)n]/(mn)\}X_\sigma$
are independent, and hence the numbers { } X,- 
are independent. We now conclude that 


Er} 


Therefore the lemma is proved. 
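
The probability asserted by Lemma 1 can be checked by simulation. The sketch below is only a plausibility check under loud assumptions: pseudo-random Bernoulli digits are used as a stand-in for an admissible number, which they are not in the strict sense of the theory, and $(q/n)X$ is read as the number whose digits are the digits $q, q+n, q+2n, \cdots$ of $X$. The helper names are hypothetical.

```python
import math
import random


def decimate(x, q, n):
    """(q/n)X: the digits of X in positions q, q+n, q+2n, ... (positions start at 1)."""
    return x[q - 1::n]


def symbolic_or(sequences):
    """Digit-wise 'or' of several digit sequences of equal length."""
    return [int(any(column)) for column in zip(*sequences)]


def success_ratio(x):
    """Fraction of digits equal to one."""
    return sum(x) / len(x)


rng = random.Random(0)
n = 4
p = 1 - math.exp(-1 / n)   # probability attached to intervals of length 1/4
x = [1 if rng.random() < p else 0 for _ in range(200_000)]
u = symbolic_or([decimate(x, q, n) for q in range(1, n + 1)])
print("observed success ratio:", success_ratio(u))
print("1 - (1 - p)^n         :", 1 - (1 - p) ** n)
```

With $p = 1 - e^{-1/n}$ the lemma's value $1 - (1-p)^n$ equals $1 - e^{-1}$ exactly, which is why refining the intervals leaves the probability for a unit interval unchanged.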
The following fundamental lemma will enable us to determine the choice of the numbers $N_s$.


Lemma 2. If there exists a sequence U,, U2,---,U., +--+ and a monotonic 
non-increasing sequence €, such that lim,... €,=0, and if H,, Hz, ---,H,, - 
and ++ are two sets of integers such that 


(a) | — | < «./3, if N= A,, 


(b) | pw(U.) — p| + [ue + < if 
where s=Jit+Jo+ +Js-1 (ui =0), and if x is such that its digits p,.+1 to 


Me+1 are the same as the digits 1 to J, of U., where soSs, then 
(c) | pw(x) — p] <4, 
provided SN and SoSs. 
* See Copeland, Admissible numbers in the theory of probability, American Journal of Mathematics, Oct., 1928, p. 539.
† Since these numbers are independent, their negations are also.


q=1 n 
then 
|-~ 
Then 
and 




By hypothesis, the digits u,+1 to u.4: of x are the same as the digits 1 to 
J, of U,, where so<s. Hence, we have 


and 
(1) (2) (k) (Js) 


where the x and u,™ are ones or zeros and get?) 
gee) =y,J+, where soSs.* 


Since 
(k) 


(d) = and ps(U.) = 


k=1 k=l 


it follows that 
| pw(x) — Ji, 
when 2 ps1. 
Hence 


N- 


if N +d,. 
Combining (b) and (e), we obtain 


(f) pv(x) — < <| 
But 


if Meri SN +H,. 

Therefore, adding (f) and (g) we get 


(h) | pr(x) <4, 


if 
We also know that 


(k) 
N—ws+1 


k=l 
provided NW 


* It may be noted that the first u,, digits may be arbitrarily defined for x. 




Using (d) and (i), we may form 
(j) | Npw(x) — — (N — wets) | 


and find for what values of N this expression will be less than Ne,/3. We find 
that 


(k) | pw(x) — [J./N]ps.(U.) — ([N — wets | 
< [NV J, — (N — wert) = «/3, 


if Ms+1 +H, N S 
Combining (a), (b) and (k), we obtain 


(1) | px(x) - [{J. +N-— evi} /N]p| < [2 — u./N]e,/3 < 
But 
(m) [{J.+ N — werr}/N]p| = (ue/N)p < 


if SN Spee. 
Adding (1) and (m), we get 


(n) | pw(x) — p| < &, 


if s SN 
It follows from (h)-and (n) that 


| pw(x) — < 4, 


provided N where so Ss. 

In order to prove Theorem 1, we shall put it in the form of the funda- 
mental lemma. 

Since U, is an element of A(1—e~), it follows that 


Let $\epsilon_1, \epsilon_2, \cdots, \epsilon_s, \cdots$ be a decreasing sequence of positive numbers having the limit zero. We can choose two sets of integers $M_1, M_2, \cdots, M_s, \cdots$ and $N_1, N_2, \cdots, N_s, \cdots$ such that


whenever N =>M,/(nr), and 


i=l N, 




when $N \ge N_s/(n\tau)$, and where $\nu_s = N_1 + N_2 + \cdots + N_{s-1}$. The sequence $M_s$ must be chosen so that $M_s/(n\tau)$ is an integer. It has been stated that $\nu_s/(n\tau)$ and $N_s/(n\tau)$ are integers.

It is understood that conditions (1') and (2') hold for every set of positive integers $n$, $\sigma$, $p$, $r_1, \cdots, r_k$, such that $n \le s$, $\sigma \le s$, $r_i \le n$ and $k \le n$, where $r_i \ne r_j$ if $i \ne j$.

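The first $\nu_\sigma/\tau$ digits of $\sim x_\tau$ are found directly from the time series, for instance the first digit of $\sim x_\tau$ is one if there is at least one point of the time series in $I_1$. When $\sigma \le s$, we have shown that the digits $(\nu_s/\tau) + 1$ to $\nu_{s+1}/\tau$ of $\sim x_\tau$ are the same as the digits 1 to $N_s/\tau$ of $U_s$. Then the digits $(\nu_s/(n\tau)) + 1$ to $\nu_{s+1}/(n\tau)$ of $[\prod_{i=1}^{k}(r_i/n)\sim x_\tau\,\cdot]$ are the same digits as 1 to $N_s/(n\tau)$ of $[\prod_{i=1}^{k}(r_i/n)U_s\,\cdot]$.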
We may now make the comparison of the theorem with the lemma as 
follows. The numbers 


i=l n nT nT 


G)~=| 


have taken the places of $U_s$, $p$, $H_s$, $J_s$, $\mu_s$, $s_0$ and $x$ respectively. The numbers $N_s$ have now been selected and hence the time series is determined so that for every $p$ and $\sigma$, $\sim x_\tau$ is an element of $A(1 - e^{-\tau})$.

5. Since a time series has been obtained so that the conditions of admissibility are satisfied in the period $\tau$, it is natural to inquire whether similar conditions hold when, within the period $\tau$, sub-intervals of the form $2^{-\sigma+1}$ are omitted from consideration.

We shall consider $m$ intervals of the form $2^{-\sigma+1}$ in the period $\tau$. Let these intervals begin at $\rho_i\cdot 2^{-\sigma+1}$, $i = 1, 2, \cdots, m$, where $\rho_m + 1 \le p$ and $m \le p$. The sub-intervals of $I_k$ will be defined as follows: $(k-1)\tau + \rho_i\cdot 2^{-\sigma+1} < h \le (k-1)\tau + (\rho_i + 1)2^{-\sigma+1}$, $i = 1, 2, \cdots, m$. Corresponding to $\sim x_\tau$ of §2, we define a number $\sim x_{\tau'}$* such that its $k$th digit is one if there exists at least one point of the time series in the $m$ intervals of $I_k$ and zero otherwise. The points of the time series must possess the property that $p[\sim x_{\tau'}]$ is equal to $(1 - e^{-\tau'})$, for every integral value of $m$, $\sigma$ and $p$, where $m \le p$.

Since the same type of time series is used for this case, then this series is defined when the corresponding numbers $N_s$† are defined.

With these remarks, we come to 

* This symbol written in full is $\sim x(0, \tau', t, \Lambda)$, where $\tau' = m\cdot 2^{-\sigma+1}$, $t = 0$ and $\Lambda = \tau = p\cdot 2^{-\sigma+1}$.

† The numbers $N_s$ used here are not necessarily the same numerically as in the preceding discussion.


N, Vs 
and 




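THEOREM 2. If the conditions $(H_1)$ of Theorem 1 are satisfied, then the numbers $N_1, N_2, \cdots, N_s, \cdots$ can be so chosen that for every $\tau$ and $\tau'$ satisfying the conditions

$$\tau = p\cdot 2^{-\sigma+1}, \qquad \tau' = m\cdot 2^{-\sigma+1},$$

where $m$, $p$ and $\sigma$ are positive integers, $m \le p$ and $0 < p \le 2^{\sigma-1}$, the corresponding number $\sim x_{\tau'}$ is an element of the set $A(1 - e^{-\tau'})$.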


In proving this theorem, we employ the following scheme. The number which characterizes the $m$ intervals at the $\sigma$-stage for the event that succeeds if there is at least one point in the $m$ intervals of $I_k$, is

$$\{[(\rho_1 + 1)/p]X_\sigma \vee [(\rho_2 + 1)/p]X_\sigma \vee \cdots \vee [(\rho_m + 1)/p]X_\sigma\},$$

which is an element of $A(1 - e^{-\tau'})$, which follows from Lemma 1. At the $s$-stage, the number that characterizes this event for the $m$ intervals of $I_k$ is


| [(os- + 1)/(p-2*-*)] v + 2)/(p-2°-*)] 


t=] 


which is an element of $A(1 - e^{-\tau'})$. Let this number be represented by $W_s$. When $s \ge \sigma$ and $\nu_s/\tau < k \le \nu_{s+1}/\tau$, then there exists at least one point in the $m$ intervals of $I_k$ if and only if the $(k - \nu_s/\tau)$th digit of $W_s$ is one. Hence, when $s \ge \sigma$, the digits $(\nu_s/\tau) + 1$ to $\nu_{s+1}/\tau$ of $\sim x_{\tau'}$ are the same as the digits 1 to $N_s/\tau$ of $W_s$. The numbers $N_s$ are chosen so that $\nu_s/(n\tau)$ is an integer, if $n \le s$, $\sigma \le s$, and $0 < p \le 2^{\sigma-1}$.

Let $\epsilon_s$ be a decreasing sequence of positive numbers having the limit zero. We can select a set of integers $M_1, M_2, \cdots, M_s, \cdots$, such that


if $N \ge M_s/(n\tau)$, where the sequence $M_s$ has been chosen so that $M_s/(n\tau)$ is an integer. We can choose a second set of integers $N_1, N_2, \cdots, N_s, \cdots$ such that


when $N \ge N_s/(n\tau)$ and $\nu_s = N_1 + N_2 + \cdots + N_{s-1}$.


<—» 


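Conditions (a) and (b) hold for every $\rho_1, \rho_2, \cdots, \rho_m$, $m$, $n$, $p$ and $\sigma$ such that $\rho_m + 1 \le p$, $\sigma \le s$, and for every set of numbers $r_1, r_2, \cdots, r_k$, such that $r_i \le n$, $k \le n$, where $r_i \ne r_j$ if $i \ne j$.

When $\sigma \le s$, we see that the digits $(\nu_s/\tau) + 1$ to $\nu_{s+1}/\tau$ of $\sim x_{\tau'}$ are the same as the digits 1 to $N_s/\tau$ of $W_s$. It will be seen from the definition of $\sim x_{\tau'}$ that the digits 1 to $\nu_\sigma/\tau$ of $\sim x_{\tau'}$ have been determined from the $m$ intervals of $I_1$ to $I_{\nu_\sigma/\tau}$. The digits $(\nu_s/(n\tau)) + 1$ to $\nu_{s+1}/(n\tau)$ of $[\prod_{i=1}^{k}(r_i/n)\sim x_{\tau'}\,\cdot]$ are the same as 1 to $N_s/(n\tau)$ of $[\prod_{i=1}^{k}(r_i/n)W_s\,\cdot]$.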

The fundamental lemma may now be applied. The numbers 


and [11 | 


i=l 

have taken the places of $U_s$, $p$, $H_s$, $J_s$, $\mu_s$, $s_0$ and $x$ respectively. This shows that the numbers $N_s$ have been chosen so that the time series is definitely determined in such a way that for every $m$, $p$, and $\sigma$, where $m \le p$, $\sim x_{\tau'}$ is an element of $A(1 - e^{-\tau'})$.

If $m = p$, then $\rho_1 = 0$ and this case reduces to Theorem 1. Furthermore, if $\rho_1 = r$ and $\rho_m = r + m - 1$, then $\sim x_{\tau'}$ becomes $\sim x(0, \tau', t, \Lambda)$ where $\tau' = m\cdot 2^{-\sigma+1}$, $t = r\cdot 2^{-\sigma+1}$ and $\Lambda = p\cdot 2^{-\sigma+1}$.


III. PROBABILITY OF a POINTS LYING IN AN INTERVAL 


6. The same type of time series that was defined in §2 will be used here. As formerly, the intervals are closed on the right and open on the left. The $k$th interval $I_k$ is $(k-1)\tau < h \le k\tau$. Let us now define the number $x(\alpha, \tau, 0, \tau)$* such that its $k$th digit is one if exactly $\alpha$ points lie in $I_k$ and zero otherwise. The points of the time series must be distributed on the time axis so that $p[x_\tau]$ is equal to $[(\tau^\alpha e^{-\tau})/\alpha!]$, for every integral value of $p$, $\sigma$ and $\alpha$.

7. As before, the numbers $N_s$ must be chosen so that a consistent time series will be defined. The fundamental theorem here is the following:


THEOREM 3. If the conditions $(H_1)$ of Theorem 1 are satisfied, then the numbers $N_1, N_2, \cdots, N_s, \cdots$ can be chosen for every $\tau$ satisfying the conditions

$$(H_2)\qquad \tau = p\cdot 2^{-\sigma+1},$$

$p$, $\alpha$ and $\sigma$ are positive integers, where $0 < p \le 2^{\sigma-1}$, so that the corresponding number $x_\tau$ is a member of the set $A[(\tau^\alpha e^{-\tau})/\alpha!]$.

* Let $x(\alpha, \tau, 0, \tau)$ be denoted by $x_\tau$.

Let us define $p$, $\tau$, $\beta_s$ and $\alpha$, such that $0 < p \le 2^{\sigma-1}$, $\tau = p\cdot 2^{-\sigma+1}$, $\beta_s = p\cdot 2^{s-\sigma}$ and $\alpha \le \beta_s$. At the $s$-stage there are $\beta_s$ divisions of $\tau$. Let us consider an event which succeeds if there is a point in each of $\alpha$ such intervals and no points in the remaining $(\beta_s - \alpha)$ intervals. The number corresponding to this situation


t=1 Bs Bs 
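where $q_i \ne q_{i'}$ if $i \ne i'$ in every term of the symbolic sum $(\vee)$. This number $U_s$ is admissible,† and is a member of the set $A[{}_{\beta_s}C_\alpha (1 - e^{-\tau/\beta_s})^{\alpha}(e^{-\tau/\beta_s})^{\beta_s - \alpha}]$.

Since

$$\lim_{s\to\infty} {}_{\beta_s}C_\alpha (1 - e^{-\tau/\beta_s})^{\alpha}(e^{-\tau/\beta_s})^{\beta_s - \alpha} = \frac{\tau^\alpha e^{-\tau}}{\alpha!},$$

we can choose $s$ and $N$ in such a manner that the difference

$$p_N\Big\{\prod_{i=1}^{k}\Big(\frac{r_i}{n}\Big)U_s\,\cdot\Big\} - \{[\tau^\alpha e^{-\tau}]/\alpha!\}^{k}$$

is arbitrarily small. We have to satisfy conditions analogous to (1') and (2') of §4 but since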


we have an additional problem. The sequence $\epsilon_s$ can no longer be arbitrary and the conditions analogous to (1') and (2') cannot hold for every $s$ which is greater than or equal to $\sigma$. We will show that there exists a monotonic non-increasing sequence $\epsilon_s$ such that $\lim_{s\to\infty}\epsilon_s = 0$ and two sequences of integers $M_1, M_2, \cdots, M_s, \cdots$ and $N_1, N_2, \cdots, N_s, \cdots$ and a function $f(s)$ with an inverse $f^{-1}(\sigma)$ such that if $\sigma \le f(s)$ or $\sigma \le f^{-1}(\sigma) \le s$, then the condition analogous to (1') and (2') holds.§ When $s \ge f^{-1}(\sigma) \ge \sigma$ and $\nu_s/\tau < k \le \nu_{s+1}/\tau$, then there will be exactly $\alpha$ points in $I_k$ if and only if the digit $(k - \nu_s/\tau)$ of $U_s$ is one. Hence, when $s \ge f^{-1}(\sigma) \ge \sigma$, the digits $(\nu_s/\tau) + 1$ to $\nu_{s+1}/\tau$ of $x_\tau$ are the same as the digits 1 to $N_s/\tau$ of $U_s$. We shall now show that this is true.
The numbers J, referred to in §3 are chosen so that v,/(nr) is an integer, 
when n<f(s), o<f—(c) Ss and 0<p<2*-", from which we see that N,/(nr) 
is an integer. Choosing a numbers from the set of equations (1) of §4, where 
* Since it is possible to choose $a$ intervals from $\beta_s$ in ${}_{\beta_s}C_a$ ways, then for each choice it is possible
to form ${}_{\beta_s}C_a$ numbers similar to the number which is given in the braces. The symbol $\sum$ with index ${}_{\beta_s}C_a$
represents the symbolic sum ($\vee$) of these numbers.

† Copeland, Admissible numbers in the theory of probability, American Journal of Mathematics,
vol. 50, Oct., 1928, Theorem 16, p. 550.

‡ See von Mises, Vorlesungen aus dem Gebiete der angewandten Mathematik, pp. 147-48.

§ See (3) and (4) of this section given below.




$\rho\cdot 2^{s-\sigma}$ has been replaced by $\beta_s$, we may form the symbolic product ($\cdot$) of
these numbers and the negations of the remaining $(\beta_s-a)$ numbers. This
product ($\cdot$) is


@ (£)x. = (Tix Ti a x 


i=] Bs t=a+1 t=a+l1 


(Tix. Il (a — 
t=1 


t=a+1 


We now raise the question when will 


(b) Ila - x) 
t=a+1 

be one? In order that this product be one, every $X_s^{(q_i)}$ ($i=1, 2, \cdots, a$) must
be one and every $X_s^{(q_i)}$ ($i=a+1, a+2, \cdots, \beta_s$) must be zero. The only time
this will be the case is when there is a point in each of the $a$ intervals $q_i/\beta_s$
($i=1, 2, \cdots, a$), and not a point in the $(\beta_s-a)$ intervals $q_i/\beta_s$ ($i=a+1,
a+2, \cdots, \beta_s$), for the first period of the $s$-stage; that is, of the period $I_{(\nu_s/\tau)+1}$.
There are ${}_{\beta_s}C_a$ such numbers as (a) which can be formed from (1) of §4, but
if (a) has for its first digit one, the remaining $({}_{\beta_s}C_a-1)$ numbers have for their
first digit zero and hence $U_s$ will have its first digit one. Hence the $[(\nu_s/\tau)+1]$th
digit of $x_\tau$ is the same as the first digit of $U_s$. This process may be continued
for the $N_s/\tau$ digits of $U_s$.

We will now find the function $f(s)$ which has been referred to heretofore
and the monotonic non-increasing sequence $\epsilon_s$ approaching zero. We will show
that $\eta(\rho, k, \sigma, a, s)<\epsilon_s/3$ for every $k$, $\rho$, $a$ and $\sigma$ for which $k, a, \sigma\le f(s)$ and
$0<\rho\le 2^{\sigma-1}$, where

(1)    $|\{{}_{\beta_s}C_a\,(1-e^{-\tau/\beta_s})^a\,(e^{-\tau/\beta_s})^{\beta_s-a}\}^k - \{(\tau^a e^{-\tau})/a!\}^k| = \eta(\rho, k, \sigma, a, s)$.

If $k$, $\rho$, $\sigma$ and $a$ are fixed, then

$\lim_{s\to\infty} \eta(\rho, k, \sigma, a, s) = 0$.

For every $\sigma$, there is a finite number of integers $\rho$, hence we may choose
$\eta(k, \sigma, a, s)$ greater than the greatest of the numbers $\eta(\rho, k, \sigma, a, s)$. Then the
$\eta(k, \sigma, a, s)$ sequence dominates the $\eta(\rho, k, \sigma, a, s)$ sequence, but it is not necessarily
monotonic. Let $\epsilon(k, \sigma, a, s)$ equal the least upper bound of the sequence
$\eta(k, \sigma, a, s), \eta(k, \sigma, a, s+1), \cdots$; then $\epsilon(k, \sigma, a, s)$ dominates
$\eta(k, \sigma, a, s)$ and is a monotonic zero sequence. For a given $\delta$,
there exists a $z_\delta$ such that $z_{\delta+1}>z_\delta$ and $\epsilon(k, \sigma, a, z_\delta)\le\epsilon(1, 1, 1, \delta)$ if $k\le\delta$, $\sigma\le\delta$
and $a\le\delta$. Let $f(s)=\delta$ if $z_\delta\le s<z_{\delta+1}$. Hence $\epsilon(k, \sigma, a, s)\le\epsilon(1, 1, 1, f(s))$ if $k$,




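$\sigma$, $a\le f(s)$. Let $\epsilon_s/3$ equal $\epsilon(1, 1, 1, f(s))$, from which it follows that
$\eta(\rho, k, \sigma, a, s)<\epsilon_s/3$. Hence we conclude that

(1')    $|\{{}_{\beta_s}C_a\,(1-e^{-\tau/\beta_s})^a\,(e^{-\tau/\beta_s})^{\beta_s-a}\}^k - \{(\tau^a e^{-\tau})/a!\}^k| < \epsilon_s/3$,

if $k, \sigma, a\le f(s)$ and $0<\rho\le 2^{\sigma-1}$.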
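
The convergence behind (1') can be checked numerically. The following is only a sketch under stated assumptions: the function names are hypothetical, and the binomial-type term is read as ${}_{\beta_s}C_a(1-e^{-\tau/\beta_s})^a(e^{-\tau/\beta_s})^{\beta_s-a}$ with $\beta_s=\rho\cdot 2^{s-\sigma}$, $\tau=\rho\cdot 2^{-\sigma+1}$ and $s\ge\sigma$.

```python
import math

def binomial_term(rho, sigma, a, s):
    """C(beta_s, a) * (1 - e^{-tau/beta_s})^a * (e^{-tau/beta_s})^(beta_s - a).

    Assumes s >= sigma so that beta_s = rho * 2^(s - sigma) is an integer.
    """
    tau = rho * 2.0 ** (-sigma + 1)
    beta = rho * 2 ** (s - sigma)
    p_hit = -math.expm1(-tau / beta)  # probability of a point in one sub-interval
    p_rest = math.exp(-tau * (beta - a) / beta)  # the other beta - a sub-intervals are empty
    return math.comb(beta, a) * p_hit ** a * p_rest

def poisson_term(tau, a):
    """The limiting value tau^a * e^{-tau} / a!."""
    return tau ** a * math.exp(-tau) / math.factorial(a)

def eta(rho, k, sigma, a, s):
    """Absolute difference of the k-th powers, as in equation (1)."""
    tau = rho * 2.0 ** (-sigma + 1)
    return abs(binomial_term(rho, sigma, a, s) ** k - poisson_term(tau, a) ** k)

for s in (3, 6, 10, 20):
    print(s, eta(rho=1, k=1, sigma=1, a=2, s=s))
```

The printed differences shrink as $s$ grows, which is the behaviour the sequence $\epsilon_s$ is constructed to dominate; the sketch says nothing about how fast $\epsilon_s$ itself must decrease.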

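We are now in a position to define the function $f^{-1}(\sigma)$. Let $f^{-1}(\delta)=z_\delta$. According
to the definition of $z_\delta$, it follows that $f^{-1}(\delta+1)>f^{-1}(\delta)$ and since
$f^{-1}(1)\ge 1$, then $f^{-1}(\delta)\ge\delta$. If $f^{-1}(\sigma)\le s<f^{-1}(\sigma+1)$, then $f(s)=\sigma$, and if $f^{-1}(\sigma)
\le s$, then $s\ge f^{-1}(\sigma)\ge\sigma$. Hence the function $f^{-1}(\sigma)$ has been established.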
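
The step function $f$ and its inverse can be sketched as a lookup in a table of thresholds. The table `z` below is a made-up example, not taken from the paper; the only rule used is that $f(s)=\delta$ when $z_\delta\le s<z_{\delta+1}$ and $f^{-1}(\delta)=z_\delta$.

```python
from bisect import bisect_right

def make_f(z):
    """Build f from a strictly increasing list z = [z_1, z_2, ...].

    f(s) = delta exactly when z_delta <= s < z_{delta+1}. Beyond the last
    listed threshold the sketch simply returns len(z), since the table is finite.
    """
    def f(s):
        if s < z[0]:
            raise ValueError("f(s) is only defined for s >= z_1")
        return bisect_right(z, s)  # number of thresholds that are <= s
    return f

def f_inverse(z, delta):
    """f^{-1}(delta) = z_delta (1-indexed)."""
    return z[delta - 1]

z = [2, 5, 9]  # hypothetical thresholds z_1 < z_2 < z_3
f = make_f(z)
print([f(s) for s in range(2, 10)])
```

With integer thresholds satisfying $z_1\ge 1$, strict monotonicity gives $z_\delta\ge\delta$, which is the inequality $f^{-1}(\delta)\ge\delta$ used in the text.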

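We shall now give the formal proof of Theorem 3; that is to say, we may
now put it in form so that the fundamental lemma may be applied. From the
fact that $U_s$ is an element of

$A[{}_{\beta_s}C_a\,(1-e^{-\tau/\beta_s})^a\,(e^{-\tau/\beta_s})^{\beta_s-a}]$

we have

(2)    $p\{\prod_{i=1}^{k}(r_i/n)\,U_s\} = \{{}_{\beta_s}C_a\,(1-e^{-\tau/\beta_s})^a\,(e^{-\tau/\beta_s})^{\beta_s-a}\}^k$.
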
It follows from (1') and (2) that we can select the two sets of integers
$M_1, M_2, \cdots, M_s, \cdots$ and $N_1, N_2, \cdots, N_s, \cdots$, referred to above, such
that

T%e-7) és 
int a! 3 
if $N\ge M_s/(n\tau)$, where the sequence $M_s$ is chosen so that $M_s/(n\tau)$ is an integer,
and


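if $N=N_s/(n\tau)$, where $\nu_s=N_1+N_2+\cdots+N_{s-1}$.

The inequalities (3) and (4) hold for every $k$, $n$, $\rho$, $\beta_s$, $f^{-1}(\sigma)$, $\sigma$, $s$, $f(s)$ and $a$
such that $k\le n\le f(s)$, $0<\rho\le 2^{\sigma-1}$, $\beta_s=\rho\cdot 2^{s-\sigma}$, $\sigma\le f^{-1}(\sigma)\le s$, $\sigma\le f(s)$, $a$ is less
than or equal to the smaller of $\beta_s$ or $f(s)$, and for every set of numbers
$r_1, r_2, \cdots, r_k$ such that $r_i\le n$ and $r_i\ne r_j$ if $i\ne j$.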

When $\sigma\le f^{-1}(\sigma)\le s$, we have shown that the digits $(\nu_s/\tau)+1$ to $\nu_{s+1}/\tau$ of
$x_\tau$ are the same as the digits 1 to $N_s/\tau$ of $U_s$. From the definition of $x_\tau$, we
know that the digits 1 to $\nu_{f^{-1}(\sigma)}/\tau$ of $x_\tau$ are determined from the intervals $I_1$ to
$I_K$ of the time axis, where $K=\nu_{f^{-1}(\sigma)}/\tau$. The digits

$\nu_s/(n\tau)+1$ to $\nu_{s+1}/(n\tau)$ of $\prod_{i=1}^{k}(r_i/n)\,x_\tau$

are the same as the digits

1 to $N_s/(n\tau)$ of $\prod_{i=1}^{k}(r_i/n)\,U_s$.


In applying the lemma, the numbers 


kon; re] M, 
(EST. 
a! nT 


NT nT i=1 

have taken the places of U,, p, H., Js, us, So and x respectively. Since the 

numbers $N_s$ have been selected, the time series is determined so that $x_\tau$ is a

member of the set $A[(\tau^a e^{-\tau})/a!]$, for every $\rho$, $\sigma$ and $a$.

8. We will now consider the period $\tau$ where sub-intervals of the form
$2^{-\sigma+1}$ are omitted. With only slight modifications of the theorem in the preceding
section, we can show that the laws of admissibility are satisfied for
this case. The same type of time series will be used.

We shall consider $m$ intervals of the form $2^{-\sigma+1}$ which repeat every $\rho$ intervals.
These intervals are of the same length as those intervals which make up
that part of the time axis at the $\sigma$-stage. Let the intervals which are under
consideration begin at $p_i\cdot 2^{-\sigma+1}$ ($i=1, 2, \cdots, m$), where $p_m+1\le\rho$ and

The sub-intervals of $I_k$ will be defined as $(k-1)\tau+p_i\cdot 2^{-\sigma+1}<h\le(k-1)\tau
+(p_i+1)2^{-\sigma+1}$ ($i=1, 2, \cdots, m$). We will define $x_{\tau'}$* such that its $k$th digit
is one if exactly $a$ points lie in the $m$ intervals of $I_k$, and zero otherwise. Since
the same type of time series is used here, the series will be defined when
the numbers $N_s$ are determined. Hence,


THEOREM 4. If the conditions ($H_1$) of Theorem 1 hold, then the numbers
$N_1, N_2, \cdots, N_s, \cdots$ can be chosen for every $\tau$ and $\tau'$ satisfying the conditions

($H_2$)    $\tau=\rho\cdot 2^{-\sigma+1}$, $\tau'=m\cdot 2^{-\sigma+1}$;

$m$, $\rho$ and $\sigma$ are positive integers, where $m\le\rho$ and $0<\rho\le 2^{\sigma-1}$, so that the corresponding
number $x_{\tau'}$ is an element of $A[(\tau'^a e^{-\tau'})/a!]$.
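Let $\rho$, $\tau$, $\tau'$, $\beta_s$, $m$ and $a$ be defined such that $0<\rho\le 2^{\sigma-1}$, $\tau=\rho\cdot 2^{-\sigma+1}$,
$\tau'=m\cdot 2^{-\sigma+1}$, $\beta_s=m\cdot 2^{s-\sigma}$, $m\le\rho$ and $a\le\beta_s$. At the $s$-stage there are $\beta_s$ intervals
in the $m$ intervals of $I_k$. Let us consider an event which succeeds if there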

* The symbol $x_{\tau'}$ is used to denote the number $x(a, \tau', t, \Lambda)$, where $\tau'=m\cdot 2^{-\sigma+1}$, $t=0$ and $\Lambda=\rho\cdot 2^{-\sigma+1}$.



is a point in each of $a$ such intervals and no points in the remaining $(\beta_s-a)$
intervals. The number corresponding to this situation is


2s 


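where $q_{ik}$ is of the form $2^{s-\sigma}\cdot p_i+k$ and $q_{jn}$ is of the form $2^{s-\sigma}\cdot p_j+n$, and if
$i=j$, then $k\ne n$, and if $k=n$, then $i\ne j$. The numbers $i$ and $j$ take values from
1 to $m$ inclusive and the numbers $k$ and $n$ take values 1 to $2^{s-\sigma}$ inclusive.
This number $W_s$ is admissible and is a member of the set

$A[{}_{\beta_s}C_a\,(1-e^{-\tau'/\beta_s})^a\,(e^{-\tau'/\beta_s})^{\beta_s-a}]$.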


When $s\ge f^{-1}(\sigma)\ge\sigma$ and $\nu_s/\tau<k\le\nu_{s+1}/\tau$, then there exist exactly $a$ points in
the $m$ intervals of $I_k$ if and only if the digit $(k-\nu_s/\tau)$ of $W_s$ is one. Hence,
when $s\ge f^{-1}(\sigma)\ge\sigma$, the digits $(\nu_s/\tau)+1$ to $\nu_{s+1}/\tau$ of $x_{\tau'}$ are the same as the
digits 1 to $N_s/\tau$ of $W_s$. The numbers $N_s$ are chosen so that $\nu_s/(n\tau)$ is an integer,
when $n\le f(s)$, $\sigma\le f^{-1}(\sigma)\le s$ and $0<\rho\le 2^{\sigma-1}$, from which it follows that
$N_s/(n\tau)$ is an integer.

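As in Theorem 3, we can find a function $f(s)$ and a monotonic non-increasing
sequence $\epsilon_s$ whose limit is zero, such that

(a)    $|\{{}_{\beta_s}C_a\,(1-e^{-\tau'/\beta_s})^a\,(e^{-\tau'/\beta_s})^{\beta_s-a}\}^k - \{(\tau'^a e^{-\tau'})/a!\}^k| < \epsilon_s/3$,

where $k, a, \sigma\le f(s)$, $m\le\rho$ and $0<\rho\le 2^{\sigma-1}$.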
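Since $W_s$ is a member of the set

$A[{}_{\beta_s}C_a\,(1-e^{-\tau'/\beta_s})^a\,(e^{-\tau'/\beta_s})^{\beta_s-a}]$

we know that

(b)    $p\{\prod_{i=1}^{k}(r_i/n)\,W_s\} = \{{}_{\beta_s}C_a\,(1-e^{-\tau'/\beta_s})^a\,(e^{-\tau'/\beta_s})^{\beta_s-a}\}^k$.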


It follows from (a) and (b) that we can select two sets of integers $M_1, M_2, \cdots, M_s, \cdots$
and $N_1, N_2, \cdots, N_s, \cdots$ such that


when $N\ge M_s/(n\tau)$, where the sequence $M_s$ has been chosen so that $M_s/(n\tau)$
is an integer, and


@ 


if $N=N_s/(n\tau)$, where $\nu_s=N_1+N_2+\cdots+N_{s-1}$.



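It is understood that conditions (c) and (d) hold for every $p_1, p_2, \cdots, p_m$,
$\rho$, $n$, $m$, $k$, $\sigma$, $s$, $f^{-1}(\sigma)$, $\beta_s$ and $a$ such that $k\le n\le f(s)$, $0<\rho\le 2^{\sigma-1}$, $m\le\rho$,
$\sigma\le f(s)$, $p_m+1\le\rho$, $\sigma\le f^{-1}(\sigma)\le s$, $\beta_s=m\cdot 2^{s-\sigma}$, $a$ is less than or equal to the
smaller of $\beta_s$ or $f(s)$, and for every set of numbers $r_1, r_2, \cdots, r_k$ such that
$r_i\le n$ and $r_i\ne r_j$ if $i\ne j$.

It follows from the definition of $x_{\tau'}$ that the digits $\nu_s/(n\tau)+1$ to $\nu_{s+1}/(n\tau)$
of $\prod_{i=1}^{k}(r_i/n)\,x_{\tau'}$ are the same digits as 1 to $N_s/(n\tau)$ of $\prod_{i=1}^{k}(r_i/n)\,W_s$.

Hence, in applying the lemma to this theorem, we see that the numbers
$N_s$ have been selected so that $x_{\tau'}$ is an element of the set $A[(\tau'^a e^{-\tau'})/a!]$, for
every $m$, $\rho$, $\sigma$ and $a$, where $m\le\rho$.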

If $m=\rho$, then $p_1=0$ and this theorem becomes Theorem 3. Furthermore,
if $p_1=r$ and $p_m=r+m-1$, the number $x_{\tau'}$ becomes $x(a, \tau, t, \Lambda)$, where
$t=r\cdot 2^{-\sigma+1}$, $\tau=m\cdot 2^{-\sigma+1}$ and $\Lambda=\rho\cdot 2^{-\sigma+1}$.

9. Since the sequence of numbers $N_s$ has been found, we have constructed
a time series which is consistent with the frequency theory of probability.

It may be noted that several theorems proved in Professor A. H. Copeland's
work (American Journal of Mathematics, vols. 50, 51, and 53) can
be proved by applying the fundamental lemma in this paper.


ST. LOUIS UNIVERSITY,
ST. LOUIS, MO.




INSCRIBED SEQUENCES OF SURFACES ASSOCIATED 
WITH GENERALIZED SEQUENCES OF LAPLACE* 


BY 
G. D. GORE 


1. INTRODUCTION 


The theory of conjugate nets of curves on surfaces in a projective space of
$n$ dimensions was generalized by Bompiani† in his theory of systems of curves
in conjugacy of type $\nu$. Bompiani also generalized the theory of families of
asymptotic curves on surfaces by a theory of families of curves in autoconjugacy
of type $\nu$.

A set of transformations for surfaces bearing systems of curves in conjugacy
of type $\nu$ was offered by B. Segre‡, who also gave a system of transformations
for surfaces bearing families of curves in autoconjugacy of type
$\nu$ ($\nu>1$). These transformations produce sequences of surfaces quite analogous
to the classical sequences of Laplace. In fact the sequences of surfaces bearing
systems of curves in conjugacy of type $\nu$ are generalizations of the classical
sequences of Laplace. It is these generalized sequences to which we refer in
the title.

It is the purpose of this paper to point out a large class of sequences 
associated with any given sequence of surfaces, and to examine the trans- 
formations which generate certain special sequences in that class. Several of 
the special sequences which we study in that class are associated with the 
above mentioned sequences of Segre. 

In §2, we state the geometric basis for a class of sequences called associ- 
ated sequences, and define generalized inscribed sequences. These generalized 
inscribed sequences form a subclass of the above associated sequences. A gen- 
eralized inscribed sequence is generated by the same kind of transformation 
as generates the sequence in which it is inscribed. The existence of a large class 
of the general inscribed sequences is established in §3, and a web of inscribed 
sequences is defined. The existence theorem of §3 is applied in §4 where a 
study is made of two classes of sequences of surfaces which are inscribed in 


* Presented to the Society, April 7, 1934; received by the editors December 15, 1933. 

† E. Bompiani, Sistemi coniugati sulle superficie degli iperspazi, Rendiconti del Circolo Matematico
di Palermo, vol. 46 (1922), p. 91.

‡ B. Segre, Les systèmes conjugués et autoconjugués d'espèce $\nu$ et leur transformation de Laplace,
Annales Scientifiques de l'École Normale Supérieure, (3), vol. 44 (1927), pp. 153-212.




a given hyperbolic sequence of Segre. Two additional types of sequences, in- 
scribed in a given parabolic sequence of Segre, are considered in §5. 

As a result of this investigation, the transformations of Laplace and Segre 
are made available for a much less restricted class of surfaces than the class 
to which they have been applied heretofore. 

In the analytical considerations which follow, a point of a surface in a
projective space of $n$ dimensions is represented by $n+1$ coordinates $x_i$, denoted
by the single symbol $x$. The $x_i$ are functions of the curvilinear coordinates
$u$ and $v$, and they have as many partial derivatives with respect to $u$
and $v$ as are needed. Partial derivatives are denoted in accordance with the
formula

(1.1)    $x^{ij} = \dfrac{\partial^{i+j}x}{\partial u^i\,\partial v^j}$.

2. INSCRIBED SEQUENCES IN GENERAL 


Consider any sequence T of surfaces 


which is generated by repeated application of a definite transformation such
that any surface $\Sigma_{i+1}$ is a transform of the surface $\Sigma_i$ by means of a one-to-one
point correspondence. The surfaces of $T$ have in common all of the
properties of the initial surface $\Sigma$ which are invariant under the generating
transformation. In this sense, they will be called mathematically equivalent
surfaces. Let $\omega_i$ be any osculant of the surface $\Sigma_i$ at a general point $P_i$, or
let $\omega_i$ be any osculant at $P_i$ of a curve belonging to a family which lies on the
surface $\Sigma_i$. In either case, let $\omega_{i+1}$ be the corresponding osculant pertaining
to the point $P_{i+1}$ of the surface $\Sigma_{i+1}$. Associated with the $\infty^2$ points of $\Sigma_i$,
there are $\infty^2$ osculants $\omega_i$, and we shall denote the doubly infinite set by $\Omega_i$.
Obviously, there is associated with the sequence of surfaces $\Sigma_i$ a sequence of
sets of osculants

Consider a surface $\Sigma'_i$ whose points are in a one-to-one correspondence
with the $\infty^2$ osculants $\omega_i$, pertaining to the surface $\Sigma_i$, in such a manner
that each point of the surface $\Sigma'_i$ is in united position with its corresponding
osculant $\omega_i$. Let $\Sigma'_{i+1}$ be a similarly described surface whose points are in a
one-to-one united correspondence with the osculants $\omega_{i+1}$ of the set $\Omega_{i+1}$. The
surface $\Sigma'_{i+1}$ is a transform of the surface $\Sigma'_i$ by means of the indirect correspondence
which connects them. The surfaces $\cdots, \Sigma'_i, \Sigma'_{i+1}, \cdots$ form a
sequence $T'$ of surfaces which will be referred to as an associated sequence of
the sequence $T$.




DEFINITION 2.1. If the points of a surface are in a one-to-one correspondence
with a set of $\infty^2$ linear spaces of $\nu$ dimensions, in such a manner that each point
of the surface is in united position with its corresponding linear space, then and
only then will the surface be said to be transversal to the set of linear spaces.

DEFINITION 2.2. Let $T$ denote a sequence of surfaces in which the points of
each surface $\Sigma_i$ are joined in a one-to-one manner to the corresponding points of
the adjacent surface $\Sigma_{i+1}$ by a set $\Omega_i$ of $\infty^2$ linear spaces $\omega$ of $\nu$ dimensions, where
the spaces $\omega$ are osculating spaces at points of the surface $\Sigma_i$ to the curves of a
family which lies on the surface $\Sigma_i$, or the spaces $\omega$ are osculants of the surface
$\Sigma_i$ itself. Let $T'$ denote a sequence in which consecutive surfaces are connected
in the same manner as those of the sequence $T$. Let the sequence $T'$ be related to
the sequence $T$ so that each surface $\Sigma'_i$ of the sequence $T'$ is transversal to the
set $\Omega_i$ of osculants which connect the points of the surfaces $\Sigma_i$ and $\Sigma_{i+1}$ of $T$.
Under these conditions the sequence $T'$ will be said to be inscribed in the sequence
$T$. The sequence $T$ will be said to circumscribe the sequence $T'$.

The surface $\Sigma'_i$ of the above inscribed sequence $T'$ may belong to a more
general class of surfaces than does the surface $\Sigma_i$ of the sequence $T$. For this
reason, the point differential equations which represent the definition of the
surface $\Sigma'_i$ will, in general, be of a higher order than the differential equations
which represent the definition of the surface $\Sigma_i$ of the sequence $T$. However,
the surfaces of the two sequences are connected by correspondences of the 
same kind. The analytical forms of the transformations in the two sequences 
will be the same. By proving the existence of these generalized inscribed se- 
quences associated with known sequences, we extend the application of 
known transformations to a more general class of surfaces and to their point 
differential equations. 

It is observed that a classical inscribed sequence of Laplace furnishes a 
special example under Definition 2.2. The following article will establish the 
existence of inscribed sequences of great generality. 

3. A WEB OF INSCRIBED SEQUENCES 

Before defining a web of inscribed sequences, we shall establish the fol- 
lowing basic 

THEOREM 3.1. Let $T$ denote a sequence of surfaces in which the points of each
surface $\Sigma_{i+1}$ are joined in a one-to-one manner to the corresponding points of the
preceding surface $\Sigma_i$ by a set $\Omega_i$ of $\infty^2$ osculating spaces of $\nu$ dimensions belonging
to the curves of a family on the surface $\Sigma_i$. Let $\Sigma'_i$ be any surface which is
transversal to the set $\Omega_i$ of osculating spaces $\omega_i$. Then it follows that the transversal
surface $\Sigma'_i$ belongs to a sequence $T'$ of surfaces which is inscribed in the
given sequence $T$.




Consider a surface $\Sigma$, and on it a family $F$ of curves. Let $\lambda$ denote the
curve of the family which passes through the generating point $P$ of the surface
$\Sigma$. Denote by $\omega$ the osculating space of $\nu$ dimensions at the point $P$ to
the curve $\lambda$. Let $\Sigma'$ and $\Sigma''$ denote two surfaces which are transversal to the
set $\Omega$ of $\infty^2$ osculating spaces $\omega$ pertaining to the $\infty^2$ points of the curves in
the family $F$. Let $P'$ and $P''$ denote the points of intersection of the surfaces
$\Sigma'$ and $\Sigma''$ respectively with the osculating space $\omega$. As the point $P$ moves
along the curve $\lambda$ on the surface $\Sigma$, the points $P'$ and $P''$ generate two curves,
$\lambda'$ and $\lambda''$ respectively, on the surfaces $\Sigma'$ and $\Sigma''$. Denote by $\omega'$ and $\omega''$ the
osculating spaces of $\nu$ dimensions to the curves $\lambda'$ and $\lambda''$ at the respective
points $P'$ and $P''$. We shall show that the osculants $\omega'$ and $\omega''$ intersect in a
point.

Let $x(u, v)$ be the coordinates of the generating point of the above surface
$\Sigma$. Let the curves of the above family $F$ be chosen as the parametric $u$-curves.
Choose any other family of curves as the $v$-curves. The osculating space of
$\nu$ dimensions $\omega$ to the $u$-curve at the point $P$ is determined by $\nu+1$ points,
whose coordinates are $x$ and the first $\nu$ derivatives of $x$ with respect to $u$.
Since the generating point $P'$ of the surface $\Sigma'$ is in contact with the osculant
$\omega$, the coordinates $y$ of the point $P'$ can be expressed as


(3.1) y = 
t=0 


For a similar reason, the coordinates $z$ of the generating point $P''$ of the surface
$\Sigma''$ are


(3.2) g= 
i=0 


The osculating spaces of $\nu$ dimensions $\omega'$ and $\omega''$ to the $u$-curves at $P'$ and $P''$
are determined by two sets of points, whose coordinates are $y$ and the first $\nu$
derivatives of $y$, and $z$ with the first $\nu$ derivatives of $z$ with respect to $u$. We
exhibit these as follows:


40 (#0) 70 
(3.3) y x yp), 


j=0 
itv 
(3.4) 
j=0 
The $2\nu+2$ coordinates $y^{i0}$ and $z^{i0}$, on the left of equations (3.3) and (3.4), are
expressed linearly in terms of the $2\nu+1$ functions $x^{j0}$. Hence there exists a
linear relation among the $z^{i0}$ and the $y^{i0}$. That relation will be indicated as




(3.5) DX = 

j=0 i=0 
Equation (3.5) indicates that the osculating space $\omega'$ intersects the osculating
space $\omega''$ in a point $P'_1$, the coordinates of which are given by either the right
or left member of (3.5). These coordinates are denoted as


(3.6) y= 
j=0 
The surface generated by the point $P'_1$ will be denoted by $\Sigma'_1$.
The above facts justify 


LEMMA 3.1. Let $\omega$ be the osculating space of $\nu$ dimensions at the point $P$ to a
curve $\lambda$ of a family of curves on a surface $\Sigma$. Let $\Sigma'$ and $\Sigma''$ be two surfaces which
are transversal to the set $\Omega$ of $\infty^2$ osculants $\omega$, pertaining to the $\infty^2$ points of the
surface $\Sigma$. Let $\lambda'$ and $\lambda''$ be the curves on $\Sigma'$ and $\Sigma''$ respectively which correspond
to the curve $\lambda$ on $\Sigma$. Then it follows that the osculating space of $\nu$ dimensions
$\omega'$ to the curve $\lambda'$ at a point $P'$ of the surface $\Sigma'$, and the osculating space
of $\nu$ dimensions $\omega''$ to the curve $\lambda''$ at the point $P''$ of the surface $\Sigma''$, intersect
in a point $P'_1$ which generates a surface $\Sigma'_1$.
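
The counting step behind the lemma, $2\nu+2$ points expressed linearly in $2\nu+1$ functions, can be illustrated with exact rational arithmetic. The sketch below is a simplification and not the paper's computation: it assumes constant coefficients, so that differentiating with respect to $u$ merely shifts a coefficient vector, and the names `shifted_rows` and `linear_relation` are hypothetical.

```python
from fractions import Fraction

def shifted_rows(coeffs, nu):
    """Coefficient vectors of y, y_u, ..., y^(nu) over x^{00}, ..., x^{2nu,0}.

    Assumes y = sum_i coeffs[i] * x^{i0} with constant coefficients, so the
    i-th derivative is the same vector shifted i places to the right.
    """
    assert len(coeffs) == nu + 1
    return [[0] * i + list(coeffs) + [0] * (nu - i) for i in range(nu + 1)]

def linear_relation(vectors):
    """Return coefficients c, not all zero, with sum_i c[i] * vectors[i] = 0.

    Requires more vectors than coordinates, so a relation must exist.
    """
    m, n = len(vectors), len(vectors[0])
    # One column per vector; solve M c = 0 by Gauss-Jordan elimination.
    M = [[Fraction(vectors[j][i]) for j in range(m)] for i in range(n)]
    pivots, row = [], 0
    for col in range(m):
        p = next((r for r in range(row, n) if M[r][col] != 0), None)
        if p is None:
            continue
        M[row], M[p] = M[p], M[row]
        inv = 1 / M[row][col]
        M[row] = [v * inv for v in M[row]]
        for r in range(n):
            if r != row and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[row])]
        pivots.append(col)
        row += 1
        if row == n:
            break
    free = next(c for c in range(m) if c not in pivots)
    coeffs = [Fraction(0)] * m
    coeffs[free] = Fraction(1)
    for r, pc in enumerate(pivots):
        coeffs[pc] = -M[r][free]
    return coeffs

nu = 2
rows = shifted_rows([1, 2, 3], nu) + shifted_rows([1, 0, 1], nu)
print(linear_relation(rows))
```

The relation returned plays the role of (3.5): splitting its terms between the $y$-rows and the $z$-rows gives one point lying in both osculating spaces. With coefficients that depend on $u$ and $v$, as in the paper, the derivatives pick up further terms but stay in the span of the same $2\nu+1$ functions, so the count is unchanged.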


From this lemma, Theorem 3.1 can be obtained directly by assuming that
the surface $\Sigma''$ of the lemma is a transform $\Sigma_1$ of the surface $\Sigma$, and that the
surfaces $\Sigma$ and $\Sigma_1$ belong to a sequence of mathematically equivalent surfaces

(3.7)    $\Sigma, \Sigma_1, \Sigma_2, \cdots$.

Since the surface $\Sigma'_1$ bears the same relation to the assumed surface $\Sigma_1$ as the
surface $\Sigma'$ bears to the surface $\Sigma$, and since the surfaces $\Sigma$ and $\Sigma_1$ are mathematically
equivalent, it follows that the surfaces $\Sigma'$ and $\Sigma'_1$ are also mathematically
equivalent. That is, the surface $\Sigma'_1$ is a transform of the surface
$\Sigma'$. Also, since the surface $\Sigma_1$ is transformable into the surface $\Sigma_2$ of the sequence
(3.7), by a repetition of the above argument, it follows that the surface
$\Sigma'_1$ is transformable in the same manner into a surface $\Sigma'_2$ of the sequence

(3.8)    $\Sigma', \Sigma'_1, \Sigma'_2, \cdots$.

If in equations (3.2), (3.5) and (3.6) we replace the coordinates $z$ by $x_1$, of
the point $P_1$ of the surface $\Sigma_1$, we obtain the following relations:


(3.9) 1 = Dd Bio 


i=0 


(3.10) > = X1"°, 


j=0 j=0 




(3.11) = D y*, 


j=0 

Equation (3.12) shows that the surface $\Sigma'_1$ of point coordinates $y_1$ is transversal
to the $\infty^2$ osculating spaces of $\nu$ dimensions to the $u$-curves of the surface
$\Sigma_1$. This fact indicates that the sequence (3.8) is inscribed in the sequence
(3.7). Equation (3.11) shows that corresponding points of the two surfaces
$\Sigma'$ and $\Sigma'_1$, of the sequence (3.8), are joined in a one-to-one manner by the
osculating spaces of $\nu$ dimensions of the $u$-curves on the surface $\Sigma'$. These
facts complete the proof of the theorem.

Equation (3.11) shows that the transformation which generates the inscribed
sequence (3.8) is of the same analytical form as the transformation
(3.9) which generates the circumscribed sequence (3.7).

The inscribed sequence $T'$ of Theorem 3.1 has all of the properties of $T$
which are required by the hypothesis. As a consequence, the theorem is applicable
to the sequence $T'$, and repeatedly, showing that there is in general
an endless aggregate of sequences of surfaces

(3.13)    $T', T'', \cdots$,

successively inscribed in a given sequence $T$.


DEFINITION. An aggregate of successively inscribed sequences of the type 
(3.13) will be called a web of inscribed sequences. The sequence T will be said to 
be circumscribed about the web. 


In a given web, the properties of the surfaces vary from one sequence to 
the next, but the transformations have the same form for the entire web. 


4. SEQUENCES INSCRIBED IN A HYPERBOLIC SEQUENCE OF SEGRE 


A surface $\Sigma$ bearing a system of curves in conjugacy of type $\nu$ may be defined
as an integral surface of a hyperbolic differential equation*

(4.1)    $\sum_{i=0}^{\nu}\sum_{j=0}^{1} A_{ij}\,x^{ij} = 0$.

Segre's transformation of the First Kind† for the above surface $\Sigma$ has the
form

(4.2)    $x_1 = \sum_{i=0}^{\nu} b_{i0}\,x^{i0}$,

* B. Segre, loc. cit., p. 161.
† B. Segre, loc. cit., p. 169.



for which the coefficients $b_{i0}$ are determined, to within a proportionality factor,
in terms of the $A_{ij}$ of equation (4.1). It is evident from equation (4.2)
that the surface $\Sigma_1$ generated by the point having coordinates $x_1$ is transversal
to the osculating spaces of $\nu$ dimensions to the $u$-curves on the surface
$\Sigma$. Corresponding points of the two surfaces $\Sigma$ and $\Sigma_1$ are joined in a one-to-one
manner by the osculating spaces of $\nu$ dimensions of the $u$-curves of the
surface $\Sigma$. By repeated application of the transformation (4.2), a sequence
$T_+$ of surfaces

(4.3)    $\Sigma, \Sigma_1, \Sigma_2, \cdots$

is generated.

Since each surface of the sequence (4.3) is an integral surface of a hyper- 
bolic differential equation of the type (4.1), we shall refer to the sequence as 
a hyperbolic sequence of Segre. That part of the entire sequence which is gen- 
erated by the Segre transformation of the First Kind will be called the forward 
or positive branch. 

From the foregoing remarks, we verify that the positive branch $T_+$ of the
above hyperbolic sequence of Segre has all of the properties required by the
hypothesis of Theorem 3.1. Consequently, the theorem is applicable to any
surface which is transversal to the $\infty^2$ osculating spaces which connect corresponding
points of any pair of consecutive surfaces in the sequence of Segre.
From this fact we have


THEOREM 4.1. The positive or forward branch $T_+$ of a hyperbolic sequence of
Segre is circumscribed about a web of inscribed sequences


(4.4) Ty, 


We now consider a class of sequences of surfaces inscribed in the inverse
or negative branch $T_-$ of a hyperbolic sequence of Segre. The transformation
which takes the above surface $\Sigma$ into its transform $\Sigma_{-1}$ of the negative or
inverse branch of the sequence of Segre* is represented by the equation

(4.5)    $x_{-1} = \sum_{i=0}^{\nu-1}\sum_{j=0}^{1} C_{ij}\,x^{ij}$.

The $x_{-1}$ are the coordinates of the point $P_{-1}$ which generates the surface $\Sigma_{-1}$.
The coefficients $C_{ij}$ are determined to within a proportionality factor, in
terms of the coefficients $A_{ij}$ of equation (4.1).

From the terms of equation (4.5) we observe that the point $P_{-1}$ lies in the
sum-space of $2\nu-1$ dimensions, formed by the osculating space of $\nu-1$ dimensions
to the $u$-curve at the point $P$ of $\Sigma$, and by the osculating space of
$\nu-1$ dimensions to the $u$-curve through the point $(u, v+\Delta v)$, adjacent to $P$.
We shall denote this sum-space by $\sigma$, and likewise the corresponding sum-spaces
at the generating points of the surfaces $\Sigma_{-1}, \Sigma_{-2}, \cdots$ by the corresponding
symbols $\sigma_{-1}, \sigma_{-2}, \cdots$.

* B. Segre, loc. cit., p. 183.

Let $\Sigma'$ denote a surface which is distinct from the two surfaces $\Sigma$ and
$\Sigma_{-1}$, but which is transversal to the $\infty^2$ osculating spaces $\sigma$ of the surface $\Sigma$.
The coordinates $y$ of the generating point $P'$ of the surface $\Sigma'$ can be expressed
as

(4.6)    $y = \sum_{i=0}^{\nu-1}\sum_{j=0}^{1} g_{ij}\,x^{ij}$,

in which the $g_{ij}$ are arbitrary except that they are distinct from the $C_{ij}$ of
equation (4.5).

On examining the two osculating sum-spaces $\sigma_{-1}$ and $\sigma'$ of the surfaces
$\Sigma_{-1}$ and $\Sigma'$, we find, as will be shown analytically in the next paragraph, that
they intersect in a point $P'_{-1}$. The point $P'_{-1}$ generates a surface $\Sigma'_{-1}$ which is
transversal to the $\infty^2$ osculating spaces $\sigma_{-1}$. It is also transversal to the $\infty^2$
osculants $\sigma'$ of the surface $\Sigma'$. From the mathematical equivalence of the
surfaces $\Sigma$ and $\Sigma_{-1}$ and by the fact that the surface $\Sigma'_{-1}$ bears the same relation
to the surface $\Sigma_{-1}$ as the surface $\Sigma'$ bears to $\Sigma$, it follows that the two
surfaces $\Sigma'$ and $\Sigma'_{-1}$ are mathematically equivalent. The surface $\Sigma'_{-1}$ is a transform
of the surface $\Sigma'$, and corresponding points of the two surfaces are
joined by the $\infty^2$ osculants $\sigma'$ of the $u$-curves on the surface $\Sigma'$. These facts
justify

THEOREM 4.2. Let $\Sigma$ and $\Sigma_{-1}$ denote two consecutive surfaces of the inverse or
negative branch $T_-$ of a hyperbolic sequence of Segre. Let $\Sigma'$ be any surface
which is distinct from the surfaces $\Sigma$ and $\Sigma_{-1}$, and which is transversal to the $\infty^2$
osculating sum-spaces $\sigma$ joining corresponding points of $\Sigma$ and $\Sigma_{-1}$. It follows
that the surface $\Sigma'$ belongs to a sequence $T'_-$ of surfaces which is inscribed in the
branch $T_-$ of the given sequence of Segre.


For the analytical justification of the above theorem, we exhibit the coordinates
of the points which determine the osculating sum-spaces $\sigma'$ and
$\sigma_{-1}$ at the points $P'$ and $P_{-1}$ of the surfaces $\Sigma'$ and $\Sigma_{-1}$. By computing derivatives
of equations (4.5) and (4.6), we obtain the desired coordinates as the
left members of


9— 154 = 0,1), 
(4.7) 

Ou) 
Gis 


i=0 




The left members of (4.7) are $4\nu$ functions expressed linearly in terms of the
$6\nu-3$ functions $x^{ij}$ ($i=0, 1, \cdots, 2\nu-2$; $j=0, 1, 2$). By computing higher
derivatives of equation (4.1), it is easily shown that there are $2\nu-2$ linear
relations among the above functions $x^{ij}$. These relations are expressed by the
equation

vty 146 

=0 (y = 0,1,---,»— 2;8 = 0,1). 

i=0 j=0 
Hence the above mentioned $4\nu$ functions of the left members of (4.7) are
ultimately expressed in terms of $4\nu-1$ of the $6\nu-3$ variables $x^{ij}$. The left
members of (4.7) therefore satisfy a linear relation of the form


1 


i=0 j=0 i=0 j=0 


This equation indicates that the osculating sum-space $\sigma'$ of the surface
$\Sigma'$ intersects the osculating sum-space $\sigma_{-1}$ of $\Sigma_{-1}$ in the point $P'_{-1}$, the coordinates
$y_{-1}$ of which are obtained from the left member of (4.8) as


1 
(4.9) ya = 
i=0 j=0 
Equation (4.9) exhibits the analytical form of the transformation which
generates the inscribed sequence $T'_-$ indicated by the above theorem. This
transformation is essentially of the same form as the transformation (4.5)
which generates the negative branch $T_-$ of the hyperbolic sequence of Segre.


5. SEQUENCES INSCRIBED IN A PARABOLIC SEQUENCE OF SEGRE 


A surface $\Sigma$ bearing a family of curves in autoconjugacy of type $\nu$ may be
defined as an integral surface of a parabolic differential equation of the type

(5.1)    $\sum_{i=0}^{\nu+1} A_{i0}\,x^{i0} + \sum_{i=0}^{\nu-1} A_{i1}\,x^{i1} = 0$.*

The transformation of Segre†, which takes the above surface $\Sigma$ into the
surface $\Sigma_1$ of the positive branch $T_+$ of a sequence, can be represented by the
equation

(5.2)    $x_1 = \sum_{i=0}^{\nu-1} b_{i0}\,x^{i0}$    ($\nu>1$).

* B. Segre, loc. cit., p. 159.
† B. Segre, loc. cit., p. 206.




The $b_{i0}$ are determined, to within a factor, in terms of the coefficients of equation
(5.1).

A Segre sequence of surfaces bearing families of curves in autoconjugacy
of type $\nu$ will be referred to as a parabolic sequence of Segre.

Equation (5.2) shows that in the positive branch of a parabolic sequence
of Segre the corresponding points of two adjacent surfaces $\Sigma$ and $\Sigma_1$ are
joined, in a one-to-one manner, by the $\infty^2$ osculating spaces of $\nu-1$ dimensions
to $u$-curves of the surface $\Sigma$. On replacing the index $\nu$ by $\nu-1$ in Theorem
3.1, we have the resulting


THEOREM 5.1. The positive branch $T_+$ of a parabolic sequence of Segre is
circumscribed about a web of inscribed sequences of surfaces.


Attention will now be given to a class of sequences inscribed in the inverse
or negative branch $T_-$ of a given parabolic sequence of Segre.

The transformation which carries a given surface $\Sigma$ of a parabolic sequence
of Segre* into its transform $\Sigma_{-1}$ of the negative branch has the
analytic form

(5.3)    $x_{-1} = \sum_{i=0}^{\nu} C_{i0}\,x^{i0} + \sum_{i=0}^{\nu-2} C_{i1}\,x^{i1}$,

in which the $C_{ij}$ are uniquely defined, except for a proportionality factor, in
terms of the $A_{ij}$ of (5.1).

Equation (5.3) shows that the point $P_{-1}$, which generates the surface $\Sigma_{-1}$,
is in the sum-space formed by the osculating space of $\nu$ dimensions to the
$u$-curve through the point $P$ of $\Sigma$ and by the osculating space of $\nu-2$ dimensions
to the $u$-curve through the point $(u, v+\Delta v)$ of $\Sigma$. We shall denote this
osculating sum-space by $\sigma$, and shall denote the corresponding sum-space of
the surface $\Sigma_{-1}$ by $\sigma_{-1}$, etc.

Let $\Sigma'$ represent any surface, distinct from the surfaces $\Sigma$ and $\Sigma_{-1}$, which
is transversal to the osculating sum-spaces $\sigma$ of the surface $\Sigma$. On examining
the osculating sum-spaces $\sigma'$ and $\sigma_{-1}$ of the surfaces $\Sigma'$ and $\Sigma_{-1}$ we find, as
will be demonstrated analytically later, that these two sum-spaces intersect
in a point $P'_{-1}$. The point $P'_{-1}$ generates a surface $\Sigma'_{-1}$ which is transversal to
the osculating sum-spaces $\sigma_{-1}$ and $\sigma'$ of the surfaces $\Sigma_{-1}$ and $\Sigma'$. The surface
$\Sigma'_{-1}$ bears the same relation to the surface $\Sigma_{-1}$ as the surface $\Sigma'$ bears to the
surface $\Sigma$. Since the surfaces $\Sigma$ and $\Sigma_{-1}$ are mathematically equivalent, it
follows that the surfaces $\Sigma'$ and $\Sigma'_{-1}$ are mathematically equivalent. The surface
$\Sigma'_{-1}$ is a transform of the surface $\Sigma'$, and corresponding points of the


* B. Segre, loc. cit., p. 209. 




two surfaces are joined by the osculating sum-spaces $\sigma'$ at points of the surface
$\Sigma'$. From these facts we have


THEOREM 5.2. Let $\Sigma$ and $\Sigma_{-1}$ be any two consecutive surfaces in the inverse
or negative branch $\Gamma_-$ of a parabolic sequence of Segre. Let $\Sigma'$ be any third surface
which is transversal to the $\infty^2$ connecting sum-spaces $\sigma$ pertaining to the
surface $\Sigma$. Then it follows that the surface $\Sigma'$ belongs to a branch $\Gamma'_-$ of a sequence
of surfaces inscribed in the given sequence of Segre.

To justify the above theorem analytically we exhibit the coordinates of
the points which determine the osculating sum-space $\sigma_{-1}$ at the point $P_{-1}$ of
the surface $\Sigma_{-1}$. By taking derivatives of the $x_{-1}$, as expressed in (5.3), we


v+r 
(0) il 
= > Cw x 
t=0 i=0 


—2+r 


i=0 j=0 i=0 j=0 


In a similar manner the coordinates $y$ of the point $P'$ and the remaining points
which determine the sum-space $\sigma'$ at the point $P'$ of the surface $\Sigma'$ can be
displayed in the form


v+h v—2+h 


t=0 t=0 


(5.5) 
j=0 t=0 j=0 
The left members of (5.4) and (5.5) are $4\nu$ functions expressed linearly in
terms of $6\nu-3$ of the derivatives of $x$. But by means of equation (5.1) and its
derivatives we have $2\nu-2$ linear relations among the above $6\nu-3$ functions.
We exhibit the $2\nu-2$ relations as follows:


t=0 t=0 


(5.6) 


2d Bis > ve =0 3). 
7=0 j=1 

By means of the $2\nu-2$ relations (5.6), the $4\nu$ left members of equations (5.4)
and (5.5) are expressed linearly in terms of $4\nu-1$ derivatives of $x$. Hence the
left members of (5.4) and (5.5) satisfy a linear relation of the form


obtain 
(5.4) 
= 


1934] SEQUENCES OF SURFACES 


i=0 t=0 


i=0 t=O 
This equation shows that the osculating sum-spaces $\sigma'$ and $\sigma_{-1}$ meet in a point
$P'_{-1}$, the coordinates $y_{-1}$ of which are given by the left members, which we
exhibit as


v v—2 
i=0 i=0 


The transformation (5.8), which sends the surface $\Sigma'$ into the surface
$\Sigma'_{-1}$, is obviously of the same form as the transformation (5.3) which sends
the surface $\Sigma$ into the surface $\Sigma_{-1}$.


CENTRAL Y.M.C.A. COLLEGE,
CHICAGO, ILL.




THE GEOMETRY OF RIEMANNIAN SPACES* 


BY 
W. C. GRAUSTEIN 


The primary purpose of this paper is to expose, in as simple and clear a 
form as is possible, the fundamentals of the geometric structure of a Rie- 
mannian space. 

It is a general truth that the methods which pierce most deeply into the 
heart of a geometric theory are invariant methods, that is, methods which are 
independent of the choice of the coordinates in terms of which the theory is 
expressed analytically. In the case of Riemannian geometry, these are the 
methods of tensor analysis. 

As important, perhaps, as the use of invariant methods is the expression 
of the analytic theory, so far as possible, in terms of invariant quantities 
alone. For it is in this form that the theory becomes most illuminating and 
suggestive. But, in ordinary tensor analysis, the components of a tensor are 
not invariants. A first step toward our goal will be, then, to introduce for 
Riemannian geometry an intrinsic tensor analysis, that is, a form of tensor 
analysis in which the components of all tensors are invariants. 

Any theory of the geometry of a Riemannian space presupposes that the 
space is referred to a certain ennuple of congruences of curves. In the ordinary 
theory, this ennuple consists of the parametric curves. In the intrinsic theory, 
it is an ennuple E, whose choice, as will presently be evident, is entirely arbitrary.

The ordinary components of a tensor, that is, the components in the ordinary
theory, are referred to the differentials of the coordinates $x^i$ pertaining
to the ennuple of the parametric curves. The intrinsic components are referred
to the differentials of arc, $ds^i$, of the curves of the ennuple E.

There is no need in the intrinsic theory of actual coordinates pertaining
to the ennuple E; the differentials of arc $ds^i$ suffice. Accordingly, E can be
chosen arbitrarily; it does not have to be a parametric ennuple, that is, an 
ennuple with which coordinates can be associated. It may be, and we shall 
ordinarily take it to be, an ennuple of general type, and hence, of course, 
not necessarily orthogonal. 

Intrinsic components, referred to E, of the covariant derivative of a tensor 
are made possible by the introduction of invariant Christoffel symbols in 


* An invited paper, presented to the Society, June 21, 1933, under a different title; received by 
the editors February 12, 1934. 




place of the ordinary ones and the use of directional differentiation along the
curves of E instead of partial differentiation with respect to the coordinates $x^i$.

It is perhaps well to emphasize the fact that we are not introducing a 
new tensor analysis or a new covariant differentiation, but simply new com- 
ponents for the usual tensors and their customary covariant derivatives.* 

The fact that second directional derivatives are not, in general, independent
of the order of differentiation has two consequences. On the one hand, it
necessitates for directional differentiation conditions of integrability involving
a set of invariants, $B_{ij}{}^k$, depending on three indices. On the other hand,
it implies that the invariant Christoffel symbols, for example, those of the
second kind, $C_{ij}{}^k$, are not symmetric in $i$ and $j$. Actually, it turns out that
$C_{ij}{}^k - C_{ji}{}^k = B_{ij}{}^k$.

In a previous paper,† the author discussed and compared two concepts
bearing on two families of curves on a two-dimensional surface, namely, the 
concept of distantial spread, a measure of the deviation from equidistance 
of one of the families of curves with respect to the other, and the concept of 
angular spread, or associate curvature, a measure of the deviation from paral- 
lelism, in the sense of Levi-Civita, of the one family with respect to the other. 
These concepts, when generalized so as to apply to two congruences of curves 
in Riemannian space, give rise to a “distantial spread vector” of the two 


congruences, taken in a given order, and an “angular spread vector” of each 


* The intrinsic absolute calculus which we employ may be described, with reference to the litera- 
ture, from two different points of view. In the first place, it is a generalization to the case of an arbi- 
trary ennuple of the intrinsic absolute calculus with respect to an orthogonal ennuple which gradually 
grew out of Ricci’s theory of an orthogonal ennuple. The invariants associated with a tensor with 
respect to the orthogonal ennuple came to be known as the orthogonal coordinates of the tensor with 
respect to the ennuple and corresponding components of the covariant derivative of a tensor, based 
on Ricci’s coefficients of rotation, were eventually introduced. The first complete account of this 
intrinsic absolute calculus with respect to an orthogonal ennuple seems to be in Berwald,
Differentialinvarianten in der Geometrie, Riemannsche Mannigfaltigkeiten und ihre Verallgemeinerungen,
Encyklopädie der Mathematischen Wissenschaften, III, D, 11 (1923), pp. 141-143.

From another point of view, our intrinsic absolute calculus may be described as the result of 
employing the arcs of the curves of the arbitrary ennuple E as so-called nonholonomic parameters. 
Though he does not use the term, G. Hessenberg, in his Vektorielle Begründung der
Differentialgeometrie, Mathematische Annalen, vol. 78 (1917), pp. 187-217, appears to be the first to employ
the method of nonholonomic parameters. The Pfaffians in his theory are not differentials of arc, 
whereas it is essential for our purpose that they should be. Of recent years, nonholonomic parameters 
have been used by Cartan, Schouten, Vranceanu, Horák, and others, particularly in the study of
affine and more general connections and of nonholonomic manifolds. For the formal results in the
case of a general linear connection, see Z. Horák, Die Formeln für allgemeine lineare Übertragung bei
Benutzung von nichtholonomen Parametern, Nieuw Archief, vol. 15 (1928), pp. 193-201.

† Graustein, Parallelism and equidistance in classical differential geometry, these Transactions,
vol. 34 (1932), pp. 557-593. 




congruence with respect to the other. The latter vector is the same as the 
associate curvature vector of Bianchi and becomes, when the two congruences 
are identical, the curvature vector of the single congruence. 

The intrinsic contravariant components of the distantial spread vector of
the ith and jth congruences of the ennuple E are precisely $B_{ij}{}^k$, and those of
the angular spread vector of the ith congruence with respect to the jth are
$C_{ij}{}^k$. Thus, the intrinsic Christoffel symbols and the invariants $B_{ij}{}^k$ have
geometric meanings of the first order of importance.

The results thus far described are given in §§1—5. In §6 are to be found 
applications of distantial spread vectors to questions of equidistance and to 
the problem of the inclusion of r linearly independent congruences of curves 
in a family of r-dimensional surfaces. Thereby further geometrical interpretations
of the invariants $B_{ij}{}^k$ are obtained.

In §7 special types of ennuples are discussed: parametric ennuples; particular
parametric ennuples, designated as ennuples of Tchebycheff and characterized
by the fact that the differentials of arc $ds^i$ are all exact; and, finally,
Cartesian ennuples, that is, Tchebycheff ennuples whose angles are all constant.
The next section treats of ennuples Cartesian at a point and their
relationship to coordinates geodesic at a point. 

A digression is made in §10 to apply the methods previously developed 
to spaces with general metric connections. Geometric interpretations of the
intrinsic components of the tensor of torsion are found, in terms of torsion 
vectors closely allied to the torsion vector of Cartan, and relations between 
these torsion vectors and the distantial and angular spread vectors are estab- 
lished. Application of the results is made to spaces admitting absolute paral- 
lelism. 

In §11 the transformation from one ennuple of congruences to a second 
is discussed. Of course, the transformation from the intrinsic components of 
a tensor, referred to the one ennuple, to those of the same tensor, referred to 
the other ennuple, is found to obey the formal laws of tensor analysis. More- 
over, the relations between the intrinsic Christoffel symbols for the two en- 
nuples are patterned precisely after the equations of Christoffel. But these 
Christoffel symbols may be interpreted in terms of the curvature and asso- 
ciate curvature vectors of the congruences of the two ennuples, as already 
noted. Thus, Christoffel’s equations, expressed in terms of invariants, are 
simply a generalization to Riemannian geometry of the fundamental formula 
of Liouville for geodesic curvatures on a two-dimensional surface. 

In §12 the general problem of the determination of the family of surfaces 
of lowest dimensionality in which lie all the congruences of an arbitrarily 
chosen set of congruences of curves is discussed. The problem is, of course, 




identical with that of the determination of the maximum number of func- 
tionally independent integrals of a system of linear homogeneous partial dif- 
ferential equations of the first order. It is considered here from a geometric 
point of view and is shown to depend, for its solution, on the consideration 
of a sequence of sets of vectors such that the vectors of each set are distantial 
spread vectors of congruences determined by the vectors of the preceding 
sets. Application of the results is made to nonholonomic manifolds. 

1. Oblique ennuple of congruences. Let there be given in a Riemannian
space $V_n$, referred to coordinates $(x^1, x^2, \dots, x^n)$, an arbitrarily chosen
ennuple, E, consisting of $n$ ordered linearly independent congruences of directed
curves, and let the contravariant components of the field of unit vectors tangent
to the curves $C_i$ of the ith congruence, and directed in the same senses
as these curves, be $\bar a_i{}^j$, $j = 1, 2, \dots, n$.

Suppose that $\bar a^i{}_j$ is the cofactor of $\bar a_i{}^j$ in the determinant $|\bar a_i{}^j|$, divided
by the determinant: $\bar a_i{}^j\,\bar a^k{}_j = \delta_i^k$. Then $\bar a^i{}_j$, for $i$ fixed and $j = 1,
2, \dots, n$, are the covariant components of a vector-field, or, more simply,
a vector, which is perpendicular to the tangent vectors of all $n$ congruences
except that of the curves $C_i$. The vectors thus determined are known as the
vectors conjugate (or reciprocal) to the unit vectors tangent to the curves
of E.
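The cofactor construction can be tried numerically. The following is a minimal sketch, not part of the paper: it assumes the Euclidean plane with Cartesian coordinates (so that covariant and contravariant ordinary components coincide) and a hypothetical ennuple of two straight-line congruences meeting at the angle `omega`; the function names are illustrative only.

```python
import math

def unit_tangents(omega):
    """Unit tangents of a hypothetical oblique ennuple in the Euclidean plane.

    Row i holds the Cartesian components of the unit tangent to the curves C_i;
    the two directions meet at the angle omega.
    """
    return [[1.0, 0.0], [math.cos(omega), math.sin(omega)]]

def conjugate_vectors(a):
    """Rows are the conjugate vectors: cofactors of the determinant |a| divided by |a|."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    # conj[i][j] is the cofactor of a[i][j] divided by det,
    # so that sum_j conj[i][j] * a[k][j] equals the Kronecker delta.
    return [[a[1][1] / det, -a[1][0] / det],
            [-a[0][1] / det, a[0][0] / det]]

def dot(u, v):
    """Euclidean scalar product of two plane vectors."""
    return sum(p * q for p, q in zip(u, v))

if __name__ == "__main__":
    a = unit_tangents(math.pi / 3)
    c = conjugate_vectors(a)
    # Each conjugate vector is perpendicular to the other tangent vector.
    print([[round(dot(c[i], a[k]), 12) for k in range(2)] for i in range(2)])
```

In this planar case each conjugate vector has length $1/\sin\omega$, which is the secant of its angle with the corresponding tangent vector.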

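If $\partial/\partial s^i$ denotes directional differentiation in the positive direction of an
arbitrary curve $C_i$,

(1a)    $\dfrac{\partial x^j}{\partial s^i} = \bar a_i{}^j.$

Suppose that we write also, purely as a matter of notation,

(1b)    $\dfrac{\partial s^i}{\partial x^j} = \bar a^i{}_j.$

Then the relations between $\bar a_i{}^j$ and $\bar a^i{}_j$ become

(2)    $\dfrac{\partial s^i}{\partial x^j}\dfrac{\partial x^j}{\partial s^k} = \delta^i_k, \qquad \dfrac{\partial x^i}{\partial s^j}\dfrac{\partial s^j}{\partial x^k} = \delta^i_k \qquad (i, k = 1, 2, \dots, n).$

The first set of these equations says that the Pfaffian $\bar a^i{}_j\,dx^j$ has the value
zero for every curve of E except a curve $C_i$ and for a curve $C_i$ is equal to
the differential of arc of the curve, measured in the positive direction along it.
Thus, the relations between the differentials of arc $ds^i$ of the curves $C_i$ and the
differentials $dx^i$ are

(3)    $ds^i = \dfrac{\partial s^i}{\partial x^j}\,dx^j, \qquad dx^i = \dfrac{\partial x^i}{\partial s^j}\,ds^j.$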




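The relations between the directional derivatives $\partial/\partial s^i$ and the partial
derivatives $\partial/\partial x^i$ are obviously

(4)    $\dfrac{\partial f}{\partial s^i} = \dfrac{\partial x^j}{\partial s^i}\dfrac{\partial f}{\partial x^j}, \qquad \dfrac{\partial f}{\partial x^i} = \dfrac{\partial s^j}{\partial x^i}\dfrac{\partial f}{\partial s^j} \qquad (i = 1, 2, \dots, n).$

From (2), (3), and (4) it follows that

(5)    $df = \dfrac{\partial f}{\partial s^i}\,ds^i.$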


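Conditions of integrability. The fundamental relations

$\dfrac{\partial}{\partial x^j}\dfrac{\partial f}{\partial x^i} = \dfrac{\partial}{\partial x^i}\dfrac{\partial f}{\partial x^j} \qquad (i, j = 1, 2, \dots, n),$

when expressed in terms of directional derivatives, take the form

(6)    $\dfrac{\partial}{\partial s^j}\dfrac{\partial f}{\partial s^i} - \dfrac{\partial}{\partial s^i}\dfrac{\partial f}{\partial s^j} = B_{ij}{}^k\,\dfrac{\partial f}{\partial s^k},$

where

(7a)    $B_{ij}{}^k = \dfrac{\partial x^h}{\partial s^j}\dfrac{\partial}{\partial s^i}\dfrac{\partial s^k}{\partial x^h} - \dfrac{\partial x^h}{\partial s^i}\dfrac{\partial}{\partial s^j}\dfrac{\partial s^k}{\partial x^h}$

or

(7b)    $B_{ij}{}^k = \dfrac{\partial s^k}{\partial x^h}\left[\dfrac{\partial}{\partial s^j}\dfrac{\partial x^h}{\partial s^i} - \dfrac{\partial}{\partial s^i}\dfrac{\partial x^h}{\partial s^j}\right].$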


The expression in (7b) follows from that in (7a) by virtue of the relations 
obtained by directional differentiation of the first set of equations (2). 
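A numerical check can make the non-commutativity of directional derivatives concrete. The sketch below is only an illustration under stated assumptions: it takes the Euclidean plane with a hypothetical polar ennuple ($C_1$ the rays from the origin, $C_2$ the concentric circles, so that $\partial/\partial s^1 = \partial/\partial r$ and $\partial/\partial s^2 = (1/r)\,\partial/\partial\theta$), and it assumes the index order $\partial/\partial s^j(\partial f/\partial s^i) - \partial/\partial s^i(\partial f/\partial s^j) = B_{ij}{}^k\,\partial f/\partial s^k$ for (6). For this ennuple a direct computation gives $B_{12}{}^2 = 1/r$ and $B_{12}{}^1 = 0$; the test function `f` is arbitrary.

```python
import math

def f(r, theta):
    """A hypothetical smooth test function, given in polar coordinates."""
    return r * r * math.sin(theta) + r * math.cos(2.0 * theta)

def d1(g, r, theta, eps=1e-4):
    """Directional derivative along the rays C_1: d/ds^1 = d/dr (central difference)."""
    return (g(r + eps, theta) - g(r - eps, theta)) / (2.0 * eps)

def d2(g, r, theta, eps=1e-4):
    """Directional derivative along the circles C_2: d/ds^2 = (1/r) d/dtheta."""
    return (g(r, theta + eps) - g(r, theta - eps)) / (2.0 * eps * r)

def commutator(g, r, theta):
    """d/ds^2 (dg/ds^1) - d/ds^1 (dg/ds^2), by nested central differences."""
    d1g = lambda rr, tt: d1(g, rr, tt)
    d2g = lambda rr, tt: d2(g, rr, tt)
    return d2(d1g, r, theta) - d1(d2g, r, theta)

if __name__ == "__main__":
    r, theta = 2.0, 0.7
    # Left member of (6) for i = 1, j = 2, and the right member (1/r) * df/ds^2.
    print(commutator(f, r, theta), d2(f, r, theta) / r)
```

The two printed numbers should agree to within the finite-difference error, illustrating that the commutator is a first-order operator with coefficients $B_{ij}{}^k$.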


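THEOREM 1. A necessary and sufficient condition that $f_i\,ds^i$, where $f_i =
f_i(x^1, x^2, \dots, x^n)$, be an exact differential is that

(8)    $\dfrac{\partial f_i}{\partial s^j} - \dfrac{\partial f_j}{\partial s^i} = B_{ij}{}^k f_k \qquad (i, j = 1, 2, \dots, n).$

The theorem follows directly from (6) inasmuch as, according to (5),
$f_i\,ds^i$ is an exact differential if and only if there exists a function $f$ such that
$f_i = \partial f/\partial s^i$.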
2. Intrinsic tensor analysis. The geometric basis, or system of reference,
for ordinary tensor analysis is the system of parametric hypersurfaces $x^i = c_i$,
$i = 1, 2, \dots, n$, or the corresponding ennuple of congruences of parametric
curves. This ennuple is evidently of very special type.






As the system of reference for our intrinsic tensor analysis, we take the 
arbitrarily chosen ennuple E of the preceding section. 

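In the ordinary theory, the basic differentials are the differentials $dx^i$ and
the basic derivatives are the partial derivatives $\partial/\partial x^i$. In the intrinsic theory
it is the differentials of arc, $ds^i$, and the directional derivatives, $\partial/\partial s^i$, which
are fundamental.

Whereas the ordinary components of a tensor, that is, the components
in the ordinary theory, are referred to $dx^i$ and $\partial/\partial x^i$, the intrinsic components
are to be referred to $ds^i$ and $\partial/\partial s^i$. For example, if $\bar b_{ij}{}^{kl}$ are the ordinary
components of a tensor of the fourth order, that is, if
$\bar b_{ij}{}^{kl}\,dx^i\,\delta x^j\,(\partial f/\partial x^k)(\partial\phi/\partial x^l)$,
where $f$ and $\phi$ are invariant functions, is an invariant, the intrinsic components,
$b_{ij}{}^{kl}$, of the tensor are to be such that
$b_{ij}{}^{kl}\,ds^i\,\delta s^j\,(\partial f/\partial s^k)(\partial\phi/\partial s^l)$ is the
new form of this invariant:

(9)    $\bar b_{ij}{}^{kl}\,dx^i\,\delta x^j\,\dfrac{\partial f}{\partial x^k}\dfrac{\partial\phi}{\partial x^l} = b_{ij}{}^{kl}\,ds^i\,\delta s^j\,\dfrac{\partial f}{\partial s^k}\dfrac{\partial\phi}{\partial s^l}.$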


To obtain the transformation from the ordinary to the intrinsic components
of a tensor, we should substitute for $dx^i, \delta x^j, \dots, \partial f/\partial x^k, \partial\phi/\partial x^l, \dots$
in an equation such as (9) their values in terms of $ds^i, \delta s^j, \dots, \partial f/\partial s^k,
\partial\phi/\partial s^l, \dots$, as given by (3) and (4). But equations such as (9) and the
transformations (3) and (4) have the same form as the analogous equations
and transformations associated with a change from the coordinates $x^i$ to new
coordinates $y^i$. Hence, the transformation from the ordinary to the intrinsic
components of a tensor obeys the standard formal laws of tensor analysis. If, in the
transformation of the components of a tensor which is the result of a change
from the coordinates $x^i$ to coordinates $y^i$, $y^i$ is replaced by $s^i$, the transformation
becomes that from the ordinary components to the intrinsic components.

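Thus, if $\bar g_{ij}$ and $\bar g^{ij}$ are the ordinary covariant and contravariant components,
and $g_{ij}$ and $g^{ij}$ the corresponding intrinsic components, of the fundamental
tensor, we have

(10a)    $g_{ij} = \bar g_{kl}\,\dfrac{\partial x^k}{\partial s^i}\dfrac{\partial x^l}{\partial s^j}, \qquad \bar g_{ij} = g_{kl}\,\dfrac{\partial s^k}{\partial x^i}\dfrac{\partial s^l}{\partial x^j},$

(10b)    $g^{ij} = \bar g^{kl}\,\dfrac{\partial s^i}{\partial x^k}\dfrac{\partial s^j}{\partial x^l}, \qquad \bar g^{ij} = g^{kl}\,\dfrac{\partial x^i}{\partial s^k}\dfrac{\partial x^j}{\partial s^l}.$

The invariant form of the linear element, $ds^2 = \bar g_{ij}\,dx^i\,dx^j$, is

(11)    $ds^2 = g_{ij}\,ds^i\,ds^j.$

Since $\bar g_{ij}$ and $\bar g^{ij}$ are symmetric, so also are $g_{ij}$ and $g^{ij}$.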
Again, the relations between the ordinary contravariant and covariant
components, $\bar a^i$ and $\bar a_i$, of a vector and the corresponding intrinsic components,
$a^i$ and $a_i$, are

(12a)    $a^i = \bar a^j\,\dfrac{\partial s^i}{\partial x^j}, \qquad \bar a^i = a^j\,\dfrac{\partial x^i}{\partial s^j},$

(12b)    $a_i = \bar a_j\,\dfrac{\partial x^j}{\partial s^i}, \qquad \bar a_i = a_j\,\dfrac{\partial s^j}{\partial x^i}.$


Inasmuch as $\partial x^j/\partial s^i$ and $\partial s^i/\partial x^j$, for $i$ fixed and $j = 1, 2, \dots, n$, are
respectively the contravariant and covariant components of vectors, namely,
the ith tangent and the ith conjugate vector associated with E, it follows
from equations such as the first sets in (10) and (12) that the intrinsic components
of a tensor are actually invariants.*

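Components of the vectors pertaining to E. If $\bar a_h|^i$ and $\bar a_h|_i$ are the ordinary
contravariant and covariant components, and $a_h|^i$ and $a_h|_i$ the corresponding
intrinsic components, of the field of unit vectors tangent to the curves $C_h$
of the hth congruence of E, we have

(13a)    $\bar a_h|^i = \dfrac{\partial x^i}{\partial s^h}, \qquad \bar a_h|_i = g_{hj}\,\dfrac{\partial s^j}{\partial x^i},$

(13b)    $a_h|^i = \delta_h^i, \qquad a_h|_i = g_{hi}.$

Formulas (13b) follow from (13a) by means of (12) and (2), and the second
equation in (13a) follows from the first by virtue of (10a) and (2).

Denoting the ordinary covariant and contravariant components of the hth
conjugate vector-field by $\bar a^h|_i$ and $\bar a^h|^i$, and the corresponding intrinsic
components by $a^h|_i$ and $a^h|^i$, we have†

(14a)    $\bar a^h|_i = \dfrac{\partial s^h}{\partial x^i}, \qquad \bar a^h|^i = g^{hj}\,\dfrac{\partial x^i}{\partial s^j},$

(14b)    $a^h|_i = \delta^h_i, \qquad a^h|^i = g^{hi}.$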
Geometric interpretations. The first of the formulas (10a) says that

(15)    $g_{ii} = 1, \qquad g_{ij} = \cos\omega_{ij} \qquad (i, j = 1, 2, \dots, n),$

where $\omega_{ij}$ is the angle‡ at $P:(x^1, x^2, \dots, x^n)$ between the directed curves $C_i$ and
$C_j$ which pass through $P$.


* From the usual point of view, the first sets of equations in (10) and (12) define invariants
pertaining to the given tensors with respect to E, and the second sets express the ordinary components
of the tensors in terms of these invariants and the components of the vectors pertaining to E. See the
long footnote in the introduction and, for example, Eisenhart, Riemannian Geometry, p. 97.

† It is to be noted, from (13a), (14a), and (1), that $\bar a_i|^j = \bar a_i{}^j$ and $\bar a^i|_j = \bar a^i{}_j$.

‡ The angle $\phi$ between two vectors at a point shall be restricted to lie in the interval $0 \le \phi \le \pi$.




Inasmuch as

$a_h|^i\,a^h|_i = \delta_h^h = 1$ ($h$ not summed),

we conclude that the length of the conjugate vector $a^h|$ at the point $P:(x^i)$ is
equal to $\sec\theta_h$, where $\theta_h$ is the angle at $P$ between the vector $a^h|$ and the tangent
vector $a_h|$ at $P$. It follows that $0 \le \theta_h < \pi/2$.

The geometric meaning of the first of the formulas (10b) is now clear:

(16)    $g^{ii} = \sec^2\theta_i, \qquad g^{ij} = \cos\Omega_{ij}\,\sec\theta_i\,\sec\theta_j \qquad (i, j = 1, 2, \dots, n),$

where $\Omega_{ij}$ is the angle at $P$ between the conjugate vectors $a^i|$ and $a^j|$.

We now introduce in the flat space, $S_n$, tangent to $V_n$ at $P$, the Cartesian
coordinate system with respect to which the intrinsic contravariant components
$X^i$ of an arbitrary vector $V$ at $P$ are the coordinates $(X^1, X^2, \dots,
X^n)$ of the "terminal point" $Q$ of $V$. The axes of this system, which we shall
call the intrinsic contravariant coordinate system at $P$, are the directed tangents
to the curves of E which pass through $P$. Furthermore, inasmuch as
the tangent vector $a_h|^i = \delta_h^i$ is of unit length, the unit of measure on each
axis, relative to measurement in $V_n$, is actually unity.

We also introduce in $S_n$ the intrinsic covariant coordinate system, with
respect to which the intrinsic covariant components $X_i$ of the vector $V$ are
the coordinates $(X_1, X_2, \dots, X_n)$ of the point $Q$. The axes of this system
coincide in direction and sense with the conjugate vectors at $P$. The unit of
measurement on the hth axis is not unity, but $\sec\theta_h$; for the hth conjugate
vector $a^h|_i = \delta^h_i$ is of length $\sec\theta_h$.


THEOREM 2. The intrinsic covariant (contravariant) components of a vector 
V at a point P are, on the one hand, the orthogonal projections of V on the axes 
of the intrinsic contravariant (covariant) system of coordinates at P, and, on the 
other hand, the parallel projections of V on the axes of the covariant (contra- 
variant) system of coordinates at P. 
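Theorem 2 can be checked in the simplest setting. The sketch below is an illustration only, under assumptions not in the text: the Euclidean plane, Cartesian coordinates, and a hypothetical ennuple whose unit tangents meet at the angle `omega`. It computes the covariant components as orthogonal projections on the tangent axes, obtains the contravariant components by solving $g_{ij}X^j = X_i$, and lets one confirm that the latter are the parallel projections, that is, that $V = X^1 a_1| + X^2 a_2|$.

```python
import math

def dot(u, v):
    """Euclidean scalar product of two plane vectors."""
    return sum(p * q for p, q in zip(u, v))

def intrinsic_components(V, omega):
    """Intrinsic components of V for tangent axes meeting at the angle omega.

    Returns (X_up, X_lo, e, g): contravariant components, covariant
    components, the unit tangents, and the intrinsic fundamental tensor.
    """
    e = [[1.0, 0.0], [math.cos(omega), math.sin(omega)]]  # hypothetical unit tangents
    g = [[dot(e[i], e[j]) for j in range(2)] for i in range(2)]  # g_ij = cos(omega_ij)
    # Covariant components: orthogonal projections of V on the tangent axes.
    X_lo = [dot(V, e[i]) for i in range(2)]
    # Contravariant components: solve g_ij X^j = X_i by Cramer's rule (2 x 2 case).
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    X_up = [(g[1][1] * X_lo[0] - g[0][1] * X_lo[1]) / det,
            (g[0][0] * X_lo[1] - g[1][0] * X_lo[0]) / det]
    return X_up, X_lo, e, g

if __name__ == "__main__":
    V = [0.3, 1.1]
    X_up, X_lo, e, g = intrinsic_components(V, math.pi / 3)
    # Parallel projections rebuild V: V = X^1 e_1 + X^2 e_2.
    rebuilt = [X_up[0] * e[0][c] + X_up[1] * e[1][c] for c in range(2)]
    print(rebuilt, X_up, X_lo)
```

The squared length of $V$ is recovered as $X^iX_i$, which is a convenient consistency check on both sets of components.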


The second part of the theorem amounts to the previous identification
of the components of $V$ as Cartesian coordinates. The first part follows from
the relations $a_h|^i X_i = X_h$, $a^h|_i X^i = X^h$. In particular, we have in $a_h|_i = g_{hi}$ and
$a^h|^i = g^{hi}$ new interpretations of $g_{hi}$ and $g^{hi}$.

Case of an orthogonal ennuple. If each two congruences of E cut at right
angles, the tangent vectors $a_h|$ form an orthogonal ennuple of vectors, the
conjugate vectors $a^h|$ become unit vectors coincident with the corresponding
tangent vectors, the intrinsic contravariant and covariant coordinate systems
of Theorem 2 coincide in a rectangular system, parallel and orthogonal
projections on the axes of this system are identical, and the intrinsic contravariant
and covariant components of a vector are the same. Furthermore, inasmuch
as now $g_{ii} = g^{ii} = 1$, $g_{ij} = g^{ij} = 0$, $i \ne j$, any two tensors which are associate
to one another in that they are obtainable from one another by raising
subscripts or lowering superscripts by means of the fundamental tensor, have
the same components.

3. Intrinsic covariant differentiation. We now introduce, for the ennuple
E, invariant Christoffel symbols, $C_{ijk}$ and $C_{ij}{}^k$, to take the place of the ordinary
Christoffel symbols,


k 
Cin = Tut = 4 
Inasmuch as we shall assume that $C_{ijk} = g_{kh}\,C_{ij}{}^h$, it suffices to define $C_{ij}{}^k$.
We first write the formula for the transformation of $\bar C_{ij}{}^k$ induced by a change
from the coordinates $x^i$ to coordinates $y^i$. In this formula we replace the first
partial derivatives of $x^i$ with respect to $y^i$ by the corresponding directional
derivatives and the single second partial derivative by a specific one of the
two corresponding directional derivatives. Thus we get


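(17)    $C_{ij}{}^k = \bar C_{ab}{}^c\,\dfrac{\partial x^a}{\partial s^i}\dfrac{\partial x^b}{\partial s^j}\dfrac{\partial s^k}{\partial x^c} + \dfrac{\partial s^k}{\partial x^h}\,\dfrac{\partial}{\partial s^j}\dfrac{\partial x^h}{\partial s^i}.$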


Since second directional derivatives are not, in general, independent of
the order of differentiation, $C_{ij}{}^k$ is not, in general, symmetric in $i$ and $j$. In
fact, we have, from (17) and (7a), that

(18)    $C_{ij}{}^k - C_{ji}{}^k = B_{ij}{}^k.$


Intrinsic covariant differentiation. Formula (17) enables us to find the in- 
trinsic components of the covariant derivative of any tensor in terms of the 
intrinsic components of the tensor. 

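Consider, for example, the vector of equations (12). The ordinary components
of the covariant derivative of this vector are

(19)    $\bar a_{i,j} = \dfrac{\partial \bar a_i}{\partial x^j} - \bar C_{ij}{}^k\,\bar a_k, \qquad \bar a^i{}_{,j} = \dfrac{\partial \bar a^i}{\partial x^j} + \bar C_{kj}{}^i\,\bar a^k.$

The intrinsic components, which we shall denote by $a_{i,j}$ and $a^i{}_{,j}$ respectively,
are expressible in terms of the ordinary components, according to the definitions
of §2, by the formulas

(20)    $a_{i,j} = \bar a_{k,l}\,\dfrac{\partial x^k}{\partial s^i}\dfrac{\partial x^l}{\partial s^j}, \qquad a^i{}_{,j} = \bar a^k{}_{,l}\,\dfrac{\partial s^i}{\partial x^k}\dfrac{\partial x^l}{\partial s^j}.$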


When the values of $\bar a_{i,j}$ and $\bar a^i{}_{,j}$, rewritten in terms of directional derivatives
and the intrinsic components $a_i$ and $a^i$ of the vector, are substituted in (20),
these formulas take on, by virtue of (17), the following desired forms*:

(21)    $a_{i,j} = \dfrac{\partial a_i}{\partial s^j} - C_{ij}{}^h\,a_h, \qquad a^i{}_{,j} = \dfrac{\partial a^i}{\partial s^j} + C_{hj}{}^i\,a^h.$

In comparing (21) with (19), it must be borne in mind that $C_{ij}{}^k$, unlike
$\bar C_{ij}{}^k$, is not symmetric in the two subscripts. In (21) and, in fact, in all similar
formulas for the intrinsic components of covariant derivatives of tensors, it
will be found that the second subscript on the C indicates the component of
the covariant derivative in question.† In the corresponding formulas for the
ordinary components of covariant derivatives, the second subscript on the C 
may be, and usually is, made to play the same role. The two sets of formulas 
have, then, the same forms. 

It is evident that, if the ordinary components of a tensor are all zero, so
also are the intrinsic components. Thus, since $\bar g_{ij,k} = 0$ and $\bar g^{ij}{}_{,k} = 0$, it follows
that $g_{ij,k} = 0$ and $g^{ij}{}_{,k} = 0$.

Similarly, it follows that, if $f_{,ij}$ are the intrinsic components of the covariant
derivative of the gradient, $f_{,i} = \partial f/\partial s^i$, of an invariant function $f$, then
$f_{,ij} = f_{,ji}$. Hence, a necessary and sufficient condition that the vector with the
intrinsic covariant components $a_i$ be the gradient of a function is that $a_{i,j}$ be
a symmetric tensor.

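A little consideration shows that this last proposition should be simply a
restatement of Theorem 1. As a matter of fact, it is readily proved that

(22)    $a_{i,j} - a_{j,i} = \dfrac{\partial a_i}{\partial s^j} - \dfrac{\partial a_j}{\partial s^i} - B_{ij}{}^k\,a_k.$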


Invariant form of $C_{ijk}$. Setting

(23)    $B_{ijk} = g_{kh}\,B_{ij}{}^h,$

we get, from (18),

(24)    $C_{ijk} - C_{jik} = B_{ijk}.$

Since $g_{ik,j} = 0$, we also have

(25)    $C_{ijk} + C_{kji} = \dfrac{\partial g_{ik}}{\partial s^j}.$


* We might have started with these forms, with $C_{ij}{}^k$ unknown, and then derived (17) from them.
† This corresponds to the fact that in (17) the second subscript in $C_{ij}{}^k$ indicates the second
differentiation in the formation of the last term.




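Equations (24) and (25) are $n^3$ in number and yield a unique solution for
$C_{ijk}$, namely*

(26)    $C_{ijk} = \dfrac12\left\{\dfrac{\partial g_{ik}}{\partial s^j} + \dfrac{\partial g_{jk}}{\partial s^i} - \dfrac{\partial g_{ij}}{\partial s^k} + B_{ijk} + B_{jki} - B_{kij}\right\}.$

The corresponding expression for $C_{ij}{}^k$ is readily obtained.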

4. Geometric interpretations of invariant Christoffel symbols. These have 
to do with the curvature and associate curvature vectors of the congruences 
of the given ennuple. 

If $a^i$ are the intrinsic contravariant components of the field of unit vectors
tangent to the directed curves C of a given congruence, the vector

(27a)    $a^i{}_{,j}\,a^j = \left(\dfrac{\partial a^i}{\partial s^j} + C_{hj}{}^i\,a^h\right)a^j$

is the curvature vector of the curves C and its identical vanishing is the condition
that the curves C be geodesics.

The vector

(27b)    $b^i{}_{,j}\,a^j = \left(\dfrac{\partial b^i}{\partial s^j} + C_{hj}{}^i\,b^h\right)a^j,$

where $b^i$ are the intrinsic components of an arbitrary field of unit vectors,
is known as the associate curvature vector of this vector field with respect 
to the curves C. It is identically the null vector when and only when the vec- 
tors of the field are parallel, in the sense of Levi-Civita, with respect to the 
curves C. 

If the vector-field b originates as the field of unit vectors tangent to the 
directed curves K of a certain congruence, we call the vector (27b) the asso- 
ciate curvature vector of the curves K with respect to the curves C and say 
that the curves K are parallel with respect to the curves C when and only 
when it is identically null. 

Curvature vectors of the given congruences. For the curves $C_h$ of the ennuple
E, $a^i = a_h|^i = \delta_h^i$. Hence (27a) becomes $a^i{}_{,j}\,a^j = C_{hh}{}^i$.

* The corresponding formula in Hessenberg, loc. cit., p. 211, has the same form, though Hessenberg
employs, instead of $ds^i$ and $\partial/\partial s^i$, the Pfaffians $du^i = ds^i/\rho_i$ and the corresponding derivatives
$\partial/\partial u^i = \rho_i(\partial/\partial s^i)$, where $\rho_i$ are invariants. This is, of course, to be expected, inasmuch as equations
(24) and (25) are obviously unchanged by this change of Pfaffians.


The change of Pfaffians still leaves the space referred to the ennuple E. The transformations
which it effects on the fundamental quantities are readily found to be g,; = pip;gsj and 


Ops Op; Ops 
— — pid; —-, — 
és? os? 
It will be clear from these relations, after the perusal of the next two sections, why it is essential 
that we use ds‘ as the basic Pfaffians. 




THEOREM 3. The intrinsic components, $c_h|^i$ and $c_h|_i$, of the curvature
vector of the curves $C_h$ are respectively $C_{hh}{}^i$ and $C_{hhi}$:*

(28)    $c_h|^i = C_{hh}{}^i, \qquad c_h|_i = C_{hhi}.$

According to Theorem 2, $C_{hhi}$ and $C_{hh}{}^i$ are respectively the orthogonal
and parallel (parallel and orthogonal) projections of the curvature vector
of the curves $C_h$ on the ith axis of the intrinsic contravariant (covariant)
coordinate system at $P:(x^i)$. Since the curvature vector of a curve is perpendicular
to the curve, $C_{hhh}$ should be zero and this is the case.


COROLLARY. A necessary and sufficient condition that the curves $C_h$ be geodesics
is that $C_{hh}{}^i = 0$ or $C_{hhi} = 0$, $i = 1, 2, \dots, n$.

Associate curvature vectors. When we set $b^i = a_h|^i$ and $a^i = a_k|^i$, (27b) becomes
$b^i{}_{,j}\,a^j = C_{hk}{}^i$.

THEOREM 4. The intrinsic components, $c_{hk}|^i$ and $c_{hk}|_i$, of the associate
curvature vector of the curves $C_h$ with respect to the curves $C_k$, are respectively
$C_{hk}{}^i$ and $C_{hki}$:

(29)    $c_{hk}|^i = C_{hk}{}^i, \qquad c_{hk}|_i = C_{hki}.$

In the sense of Theorem 2, $C_{hki}$ and $C_{hk}{}^i$ are respectively the orthogonal
and parallel projections, on the curves $C_i$, of the associate curvature vector
of the curves $C_h$ with respect to the curves $C_k$. In particular, $C_{hkh} = 0$; this
vector is perpendicular to $C_h$.


COROLLARY. The curves $C_h$ are parallel with respect to the curves $C_k$ if and
only if $C_{hk}{}^i = 0$ or $C_{hki} = 0$, $i = 1, 2, \dots, n$.
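A worked example may help fix the meaning of Theorems 3 and 4. It is offered only as a sketch under assumptions that are not in the text: the Euclidean plane, a hypothetical polar ennuple in which $C_1$ are the rays from the origin and $C_2$ the concentric circles, and the sign conventions $\partial/\partial s^j(\partial f/\partial s^i) - \partial/\partial s^i(\partial f/\partial s^j) = B_{ij}{}^k\,\partial f/\partial s^k$ and $C_{ijk} - C_{jik} = B_{ijk}$, $C_{ijk} + C_{kji} = \partial g_{ik}/\partial s^j$.

```latex
% Hypothetical polar ennuple in the Euclidean plane (illustration only).
\begin{align*}
  ds^1 &= dr, \qquad ds^2 = r\,d\theta, \qquad
  \frac{\partial}{\partial s^1} = \frac{\partial}{\partial r}, \qquad
  \frac{\partial}{\partial s^2} = \frac{1}{r}\frac{\partial}{\partial\theta}, \qquad
  g_{ij} = \delta_{ij}. \\
% Commutator of the two directional derivatives:
  \frac{\partial}{\partial s^2}\frac{\partial f}{\partial s^1}
  - \frac{\partial}{\partial s^1}\frac{\partial f}{\partial s^2}
  &= \frac{1}{r}\,\frac{\partial f}{\partial s^2}
  \quad\Longrightarrow\quad
  B_{12}{}^2 = -B_{21}{}^2 = \frac{1}{r}, \quad \text{all other } B_{ij}{}^k = 0. \\
% With constant g_{ij} the two sets of relations are solved by
  C_{ijk} &= \tfrac12\left(B_{ijk} + B_{jki} - B_{kij}\right), \\
  C_{22}{}^1 &= -\frac{1}{r}, \qquad C_{12}{}^2 = \frac{1}{r}, \qquad
  C_{11}{}^k = C_{21}{}^k = C_{22}{}^2 = C_{12}{}^1 = 0.
\end{align*}
```

Read through Theorems 3 and 4, these values say that the rays are geodesics ($C_{11}{}^k = 0$), that the circles have a curvature vector of length $1/r$ directed toward the center ($C_{22}{}^1 = -1/r$), that the circles are parallel with respect to the rays ($C_{21}{}^k = 0$), and that the rays are not parallel with respect to the circles ($C_{12}{}^2 = 1/r$).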

Geometric interpretation of intrinsic covariant differentiation. The geometric
significance of the first of formulas (21) is now clear.


THEOREM 5. The (i, j)th intrinsic component of the covariant derivative of a
covariant vector is equal to the directional derivative, along $C_j$, of the ith
intrinsic component of the vector, minus the scalar product of the vector with the
associate curvature vector of the curves $C_i$ with respect to the curves $C_j$.


In particular, the (i, j)th component reduces to $\partial a_i/\partial s^j$ if and only if the
vector is always perpendicular to the associate curvature vector in question.
We conclude also: (a) for every vector $a$, but for a fixed $i$, $a_{i,j}$ reduces to
$\partial a_i/\partial s^j$ for $j = 1, 2, \dots, n$ when and only when the curves $C_i$ are parallel
with respect to all the curves $C_j$, $j = 1, 2, \dots, n$; and (b) for every vector $a$,

* For the case of an orthogonal ennuple, this theorem is known; see Levi-Civita, The Absolute 
Differential Calculus, p. 275. 


but for a fixed $j$, $a_{i,j}$ reduces to $\partial a_i/\partial s^j$ for $i = 1, 2, \dots, n$ if and only if all
the curves $C_i$, $i = 1, 2, \dots, n$, are parallel with respect to the curves $C_j$. In
particular, the curves $C_i$ in case (a), and the curves $C_j$ in case (b), must be
geodesics.

From the geometric interpretation of $a_{i,j}$ we may pass to one for $a^i{}_{,j}$ by
means of the relation $a^i{}_{,j} = g^{ih}\,a_{h,j}$.

Geodesics and parallelism. Suppose that there is given a directed curve C:
$x^i = x^i(s)$, expressed parametrically in terms of the arc $s$, and let $\bar a^i$ and $a^i$ be
the ordinary and intrinsic contravariant components of the unit vector tangent
to C at an arbitrary point $P$ of C.

For the curve C, $dx^i = \bar a^i\,ds$. But $dx^i = (\partial x^i/\partial s^j)\,ds^j$. Thus, $(\partial x^i/\partial s^j)\,ds^j
= \bar a^i\,ds$. Hence, since $\bar a^i = a^j(\partial x^i/\partial s^j)$, it follows that $ds^i = a^i\,ds$. For the curve
C, we have, then,

(30)    $a^i = \dfrac{ds^i}{ds}, \qquad \dfrac{d}{ds} = \dfrac{ds^i}{ds}\,\dfrac{\partial}{\partial s^i}.$

Applying these results to the right-hand side of (27a), set equal to zero,
we obtain, as the conditions that the curve C be a geodesic:

$\dfrac{d^2s^i}{ds^2} + C_{hj}{}^i\,\dfrac{ds^h}{ds}\dfrac{ds^j}{ds} = 0,$

where $d^2s^i/ds^2$ stands for $(d/ds)(ds^i/ds)$.
From (27b) we obtain similar conditions that unit vectors $b$ in the points
of the curve C be parallel with respect to the curve C.
Coefficients of rotation. The coefficients of rotation of the ennuple E, patterned
after Ricci's coefficients of rotation for an orthogonal ennuple, are of
two kinds, namely,


Yin = a,| ‘a,| = a'| ‘a,| 


It is readily shown that 


= — Cart + = = — Cre’, 


and hence, in case E is an orthogonal ennuple, that 
= = — Cart = — Cre’. 


Thus, the coefficients of rotation of any ennuple E are identical, essentially,
with the invariant Christoffel symbols formed for the ennuple. Theorems 3 
and 4 furnish, then, simple geometric interpretations of the coefficients of 
rotation. 




5. The distantial spread vector. The associate curvature vector of the
congruence of curves $C_h$ with respect to the congruence of curves $C_k$ is based
on angle. Since its length is a measure of the deviation from parallelism of
the first congruence with respect to the second, we may call it, in accordance
with a terminology employed in a previous paper,* the angular spread vector
of the curves $C_h$ with respect to the curves $C_k$.

We shall now introduce for the two congruences of curves a “spread vec- 
tor” which is based on distance and which we shall call the distantial spread 
vector of the two congruences, in a given order. 

Let $P$ be an arbitrarily chosen point in $V_n$ and let $C_h$ and $C_k$ be the curves of the two congruences which pass through $P$. Mark on $C_h$ the point $P_1$ at the directed distance $\Delta s^h$ from $P$ and on $C_k$ the point $P_2$ at the directed distance $\Delta s^k$ from $P$. On the curve of the $k$th congruence through $P_1$ mark the point $Q_1$ at the directed distance $\Delta s^k$ from $P_1$, and on the curve of the $h$th congruence through $P_2$ mark the point $Q_2$ at the directed distance $\Delta s^h$ from $P_2$, and draw the vector $Q_1Q_2$ joining $Q_1$ to $Q_2$. Then, the limit of the ratio $Q_1Q_2/(\Delta s^h\,\Delta s^k)$, when $\Delta s^h$ and $\Delta s^k$ approach zero, exists and is defined as the distantial spread vector, at $P$, of the congruences of curves $C_h$ and $C_k$, in this order.

If $\bar b_{hk}|^i$ are the ordinary contravariant components of this vector, the definition says that

$$\bar b_{hk}|^i = \lim_{\Delta s^h,\,\Delta s^k \to 0} \frac{w^i - z^i}{\Delta s^h\,\Delta s^k},$$

where $(z^i)$ and $(w^i)$ are respectively the coordinates of $Q_1$ and $Q_2$.
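It is readily shown that

$$z^i = x^i + \bar a_h|^i\,h + \bar a_k|^i\,k + \frac12\left(\frac{\partial \bar a_h|^i}{\partial s^h}\,h^2 + 2\,\frac{\partial \bar a_k|^i}{\partial s^h}\,hk + \frac{\partial \bar a_k|^i}{\partial s^k}\,k^2\right) + \cdots,$$

where $h = \Delta s^h$ and $k = \Delta s^k$ and the coefficients are evaluated for $P$. From these coordinates for $Q_1$ we obtain the coordinates $(w^i)$ of $Q_2$ by interchanging $h$ and $k$ and $\partial/\partial s^h$ and $\partial/\partial s^k$. Thus we find that

$$\bar b_{hk}|^i = \frac{\partial \bar a_h|^i}{\partial s^k} - \frac{\partial \bar a_k|^i}{\partial s^h}.$$

It follows from this result and (7b) that $b_{hk}|^i = B_{hk}{}^i$, where $b_{hk}|^i$ are the intrinsic components of the distantial spread vector.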
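
The limit in the definition can be checked numerically in a simple case. The following is a minimal sketch, assuming the Euclidean plane with the radial lines through the origin as the first congruence and the concentric circles as the second; the example and all function names are illustrative assumptions and do not come from the paper. Arc-length motion along both families has a closed form in polar coordinates, so $Q_1$ and $Q_2$ can be built exactly as in the construction above. The ratio is then compared with the vector of length $1/r$ along the circle, which is what differentiating the unit tangent fields directly suggests for this configuration.

```python
import math


def along_radial(r, theta, ds):
    """Move the directed arc length ds along the radial line through (r, theta)."""
    return r + ds, theta


def along_circle(r, theta, ds):
    """Move the directed arc length ds along the circle of radius r about the origin."""
    return r, theta + ds / r


def to_xy(r, theta):
    """Convert polar coordinates to Cartesian coordinates."""
    return r * math.cos(theta), r * math.sin(theta)


def spread_ratio(r, theta, h, k):
    """Return Q1Q2 / (h*k) for the radial lines (first) and the circles (second)."""
    # Q1: distance h along the radial line, then distance k along the circle.
    q1 = to_xy(*along_circle(*along_radial(r, theta, h), k))
    # Q2: distance k along the circle, then distance h along the radial line.
    q2 = to_xy(*along_radial(*along_circle(r, theta, k), h))
    return (q2[0] - q1[0]) / (h * k), (q2[1] - q1[1]) / (h * k)


def expected(r, theta):
    """Unit tangent of the circle at (r, theta), scaled by 1/r."""
    return -math.sin(theta) / r, math.cos(theta) / r


if __name__ == "__main__":
    r, theta = 2.0, 0.7
    for step in (1e-1, 1e-2, 1e-3):
        print(step, spread_ratio(r, theta, step, step), expected(r, theta))
```

As the step shrinks, the printed ratio should approach the expected vector. Interchanging the roles of the two congruences reverses its sign, in agreement with the dependence on order stated in the definition.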


THEOREM 6. The intrinsic components, $b_{hk}|^i$ and $b_{hk}|_i$, of the distantial spread vector of the curves $C_h$ and the curves $C_k$ are respectively $B_{hk}{}^i$ and $B_{hki}$:


(31) $b_{hk}|^i = B_{hk}{}^i$, $b_{hk}|_i = B_{hki}$.


* Graustein, loc. cit., p. 559. 




556 W. C. GRAUSTEIN [July 


We thus have simple geometric interpretations of the invariants B. 
It is evident that $b_{kh}|^i = -b_{hk}|^i$. In particular, $b_{hh}|^i = 0$.
By virtue of (28), (29), and (31), relations (18) and (24) become 


(32) — = Can |e — Coals = 
Thus, the difference between the angular spread vector of the curves $C_h$ with respect to the curves $C_k$ and that of the curves $C_k$ with respect to the curves $C_h$ is equal to the distantial spread vector of the curves $C_h$ and $C_k$. In particular:

THEOREM 7. If two of the three spread vectors of two congruences are null, 
the third is also. 

Further interpretations of the distantial spread vector are given in §§6, 12. 

Some identities. We note, for future use, the identical relations 


— By” = + + 


(33) ——By™+ + 


which are readily established by substituting for the B’s their values from 
(7a) and making use of the integrability conditions (6). 

The corresponding relations for the B;;, are obtained from these by multi- 


plying by gym, summing over m, and applying (25). They are 


0 0 
(34) Bite + + = DirpBi’ + DirpBii’ + 


where 


= Cir + ire 
ast Pp 


Dirp = Birp + 

6. Applications and interpretations of distantial spread vectors. In this connection we shall first discuss the question of the inclusion of the curves of two or more congruences in subspaces of $V_n$, and show that it finds its answer in conditions on the distantial spread vectors of the congruences.

A family of $r$-dimensional surfaces ($r = 2, \dots, n-1$) consists of the $\infty^{n-r}$ $r$-dimensional surfaces defined by $n-r$ equations of the form $\varphi^i(x^1, x^2, \dots, x^n) = c_i$, $i = 1, 2, \dots, n-r$, where $\varphi^1, \varphi^2, \dots, \varphi^{n-r}$ are functionally independent and $c_1, c_2, \dots, c_{n-r}$ are arbitrary constants. A family of $(n-1)$-dimensional surfaces is called a family of hypersurfaces.

A family of $r$-dimensional surfaces is said to contain a congruence of curves, or the congruence is said to lie in it, if each surface of the family contains $\infty^{r-1}$ curves of the congruence.




THEOREM 8. The family of hypersurfaces $\varphi(x^1, x^2, \dots, x^n) = \text{const.}$ contains a given congruence of curves if and only if the directional derivative of $\varphi$ along the curves of the congruence is identically zero.


The theorem is self-evident. 


THEOREM 9. The curves of r linearly independent congruences of curves lie in 
a family of r-dimensional surfaces if and only if the distantial spread vectors of
the congruences, taken in pairs, are linear combinations of the r tangent vectors 
of the congruences.* 


Without loss of generality, we may assume that the given congruences are the first $r$ congruences of the ennuple $E$. According to Theorem 8, the family of hypersurfaces $\varphi = \text{const.}$ contains these $r$ congruences if and only if $\partial\varphi/\partial s^i = 0$, $i = 1, 2, \dots, r$. Hence, the $r$ congruences lie in a family of $r$-dimensional surfaces if and only if the system of differential equations

$$\frac{\partial\varphi}{\partial s^i} = 0 \qquad (i = 1, 2, \dots, r)$$

is completely integrable. But the conditions of integrability, as obtained from (6), reduce to

$$\sum_{k>r} B_{ij}{}^k\,\frac{\partial\varphi}{\partial s^k} = 0 \qquad (i, j = 1, 2, \dots, r),$$

and hence are identically satisfied when and only when

$$B_{ij}{}^k = 0 \qquad (i, j = 1, 2, \dots, r;\ k = r+1, \dots, n).$$

But, by (31) and (13b), these equations constitute necessary and sufficient conditions that the distantial spread vectors of the first $r$ congruences, taken in pairs, are linearly dependent on the tangent vectors of these congruences.
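
The criterion of Theorem 9 can be tried out numerically for $r = 2$ in ordinary three-dimensional space. The following is a minimal sketch under stated assumptions: the space is Euclidean, the distantial spread vector is computed as the derivative of the first field along the second minus that of the second along the first, and the fields are not normalized to unit length (rescaling a field changes this difference only by multiples of the two fields themselves, so the linear-dependence test is unaffected). The field names and the determinant test are illustrative choices, not part of the paper.

```python
def directional_derivative(field, direction, p, eps=1e-5):
    """Central-difference derivative of a vector field at p along a direction vector."""
    fwd = field([pi + eps * di for pi, di in zip(p, direction)])
    bwd = field([pi - eps * di for pi, di in zip(p, direction)])
    return [(f - b) / (2 * eps) for f, b in zip(fwd, bwd)]


def spread(a_h, a_k, p):
    """Derivative of a_h along a_k minus derivative of a_k along a_h, at p."""
    d_h_along_k = directional_derivative(a_h, a_k(p), p)
    d_k_along_h = directional_derivative(a_k, a_h(p), p)
    return [u - v for u, v in zip(d_h_along_k, d_k_along_h)]


def det3(u, v, w):
    """Determinant of the 3x3 matrix with rows u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))


def dependence_defect(a_h, a_k, p):
    """Vanishes when the spread at p is a linear combination of a_h(p) and a_k(p),
    assuming those two vectors are themselves linearly independent."""
    return det3(a_h(p), a_k(p), spread(a_h, a_k, p))


def rot_z(p):
    """Rotation field about the z-axis; tangent to the spheres about the origin."""
    x, y, z = p
    return [-y, x, 0.0]


def rot_x(p):
    """Rotation field about the x-axis; also tangent to the spheres about the origin."""
    x, y, z = p
    return [0.0, -z, y]


def e1_tilted(p):
    """Field (1, 0, y); together with e2 it admits no family of surfaces."""
    x, y, z = p
    return [1.0, 0.0, y]


def e2(p):
    """Constant field (0, 1, 0)."""
    return [0.0, 1.0, 0.0]


if __name__ == "__main__":
    p = [0.3, 0.5, 0.8]
    print(dependence_defect(rot_z, rot_x, p))   # curves lie on spheres
    print(dependence_defect(e1_tilted, e2, p))  # no surfaces contain both families
```

For the two rotation fields the defect is zero up to rounding, since their curves lie on the spheres about the origin. For the second pair it is not, so no family of two-dimensional surfaces contains both congruences.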

The theorem is of the greatest interest in the cases $r = 2$ and $r = n-1$. We shall discuss these cases in detail, with the purpose of bringing out the bearing of distantial spread vectors on equidistance. In this discussion, indices $a, b, c, \dots$ are fixed, whereas indices $i, j, k, \dots$ vary from 1 to $n$, except as otherwise stated.

Case r =2. A typical instance of this case is the following. 


THEOREM 10. The two congruences consisting of the curves $C_a$ and $C_b$ of the ennuple $E$ lie in a family of two-dimensional surfaces if and only if


(35) $B_{ab}{}^k = 0$ ($k \ne a, b$).


* For this theorem, expressed in terms of associate curvature vectors, see Struik, Grundzüge der mehrdimensionalen Differentialgeometrie, p. 53.




The pair of differential equations $\partial\varphi/\partial s^a = 0$, $\partial\varphi/\partial s^b = 0$ have, then, $n-2$ functionally independent solutions, $\varphi^k$, $k \ne a, b$, and the $n-2$ equations $\varphi^k = c_k$, $k \ne a, b$, define the family of two-dimensional surfaces containing the curves $C_a$ and $C_b$.

The individual equations $\partial\varphi/\partial s^a = 0$ and $\partial\varphi/\partial s^b = 0$ each have $n-1$ independent solutions, $n-2$ of which may be taken in each case as $\varphi^k$, $k \ne a, b$. Let the $(n-1)$st be $\varphi^b$ in the case of the equation $\partial\varphi/\partial s^a = 0$, and $\varphi^a$ in the case of the equation $\partial\varphi/\partial s^b = 0$. Then, the congruence of curves $C_a$ is represented by the equations $\varphi^k = c_k$, $k \ne a$, and that of the curves $C_b$ by the equations $\varphi^k = c_k$, $k \ne b$.

The conditions of Theorem 10 demand the vanishing of all the components of the distantial spread vector of the congruences of curves $C_a$ and $C_b$ except $B_{ab}{}^a$ and $B_{ab}{}^b$. These two components have geometric interpretations in terms of the concept of the distantial spread of the one congruence with respect to the other formulated by R. M. Peters* as a generalization of a corresponding concept for the case $n = 2$.†

To define this concept, we note first that on an arbitrary but fixed surface, $S_2$, of the family of two-dimensional surfaces containing the given congruences, there are $\infty^1$ curves $C_a$, defined by the equation $\varphi^b = \text{const.}$, and $\infty^1$ curves $C_b$, defined by the equation $\varphi^a = \text{const.}$ Restricting ourselves for the moment to these curves $C_a$ and $C_b$, we form the logarithmic directional derivative in the positive direction of the curve $C_a$, of the distance, measured along an arbitrary curve $C_b$, between the curve $C_a$: $\varphi^b = \varphi^b_0$ and a neighboring curve $C_a$: $\varphi^b = \varphi^b_0 + \Delta\varphi^b$. Then the limit of this derivative, when $\Delta\varphi^b$ approaches zero, namely,

$$-\frac{\partial}{\partial s^a}\log\frac{\partial\varphi^b}{\partial s^b},$$

is the distantial spread of the congruence of curves $C_a$ with respect to the congruence of curves $C_b$. It vanishes identically when and only when on each surface $S_2$ the curves $C_a$ are equidistant with respect to the curves $C_b$, in that each two of them cut segments of equal length from the curves $C_b$.

The components $B_{ab}{}^a$ and $B_{ab}{}^b$ of the distantial spread vector of the congruences of curves $C_a$ and $C_b$ are essentially the distantial spreads of the two congruences with respect to one another. For we conclude from the identity


* Peters, Parallelism and equidistance in Riemannian geometry, offered to the American Journal 
of Mathematics. 
† Graustein, loc. cit., p. 561.




inasmuch as 0¢°/ds* =0, kX}, that 


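(36a) $B_{ab}{}^b = -\dfrac{\partial}{\partial s^a}\log\dfrac{\partial\varphi^b}{\partial s^b}$


and, in a similar fashion, find that


(36b) $B_{ab}{}^a = \dfrac{\partial}{\partial s^b}\log\dfrac{\partial\varphi^a}{\partial s^a}$.
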
These results, together with the conclusions they imply, are summarized, 
in general form, in the following theorems. 


THEOREM 11. Two congruences of curves $C$ and $K$ lie in a family of two-dimensional surfaces, $S_2$, if and only if their distantial spread vector lies always in the plane of their tangent vectors. The component, then, in the direction of the curves $C$, of the distantial spread vector of the curves $C$ and $K$ (in this order) is the negative of the distantial spread of the curves $K$ with respect to the curves $C$, and the component in the direction of the curves $K$ is equal to the distantial spread of the curves $C$ with respect to the curves $K$. A necessary and sufficient condition that on each of the surfaces $S_2$ the curves of one congruence be equidistant with respect to those of the second is that the distantial spread vector of the congruences lie along the tangent vector of the first-named congruence.

THEOREM 12. The distantial spread vector of two congruences is identically the null vector if and only if (a) the two congruences lie in a family of two-dimensional surfaces $S_2$, and (b) on each surface $S_2$ the curves of each congruence are equidistant with respect to those of the other, that is, clothe the surface in the sense of Tchebycheff.
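
A small worked example may help fix the signs in Theorem 11. It is a sketch under assumptions not made in the paper: the space is the Euclidean plane in polar coordinates $(r, \theta)$, $C$ is the congruence of radial lines, $K$ the congruence of concentric circles, and the distantial spread vector is computed as the difference of the cross directional derivatives of the unit tangent fields, which is what the limit definition of §5 gives in Cartesian coordinates.

```latex
% Illustrative example (not from the paper): Euclidean plane, polar coordinates (r, \theta).
% C = radial lines, K = concentric circles; the labels s^C, s^K are assumptions of this sketch.
\begin{align*}
  &\text{unit tangents:} && a_C = e_r, \qquad a_K = e_\theta, \\
  &\text{arc derivatives:} && \frac{\partial}{\partial s^C} = \frac{\partial}{\partial r}, \qquad
      \frac{\partial}{\partial s^K} = \frac{1}{r}\,\frac{\partial}{\partial\theta}, \\
  &\text{spread vector of } C, K: && \frac{\partial e_r}{\partial s^K} - \frac{\partial e_\theta}{\partial s^C}
      = \frac{1}{r}\,e_\theta - 0 = \frac{1}{r}\,e_\theta, \\
  &\text{spread of } C \text{ with respect to } K: && \frac{\partial}{\partial r}\,\log(r\,\Delta\theta) = \frac{1}{r}, \\
  &\text{spread of } K \text{ with respect to } C: && \frac{1}{r}\,\frac{\partial}{\partial\theta}\,\log(\Delta r) = 0 .
\end{align*}
```

Here $r\,\Delta\theta$ is the distance between two neighboring radial lines measured along a circle, and $\Delta r$ is the distance between two neighboring circles measured along a radial line. The component of the spread vector along $K$ is $1/r$, the distantial spread of $C$ with respect to $K$; the component along $C$ is zero, the negative of the distantial spread of $K$ with respect to $C$. The circles are thus equidistant with respect to the radial lines, and the spread vector lies along the tangent of the circles, as the last sentence of Theorem 11 requires. Since the spread vector is not null, the two families do not clothe the plane in the sense of Tchebycheff.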


Case r=n—1. Here we have the following typical result. 


THEOREM 13. A necessary and sufficient condition that the $n-1$ congruences of curves $C_i$, $i \ne a$, of the ennuple $E$ lie in a family of hypersurfaces is that


(37) $B_{ij}{}^a = 0$ ($i, j \ne a$).


The $n-1$ differential equations $\partial\varphi/\partial s^i = 0$, $i \ne a$, have, then, a solution, $\varphi^a$, other than a constant, and the equation $\varphi^a = \text{const.}$ defines the hypersurfaces, $S_{n-1}$, in which the $n-1$ congruences lie.

Since each curve $C_a$ of the $a$th congruence meets each of the hypersurfaces $S_{n-1}$ in just one point, $\varphi^a$ is a parameter common to all the curves $C_a$. In terms of this parameter, the differential of arc of curves $C_a$ has the value


(38) $ds^a = \left(\dfrac{\partial\varphi^a}{\partial s^a}\right)^{-1} d\varphi^a$.


For, inasmuch as $\partial\varphi^a/\partial s^i = 0$, $i \ne a$,

(39)

and hence

(40) $d\varphi^a = \dfrac{\partial\varphi^a}{\partial s^a}\,ds^a$.


In this case we shall find useful another concept developed by Peters,* namely that of the distantial spread, in the direction of the curves $C_i$, of the hypersurfaces $S_{n-1}$ with respect to the curves $C_a$, $i \ne a$. If $d$ is the distance, measured along an arbitrary curve $C_a$, between the hypersurface $\varphi^a = \varphi^a_0$ and a neighboring hypersurface $\varphi^a = \varphi^a_0 + \Delta\varphi^a$, this distantial spread is the limit, when $\Delta\varphi^a$ approaches zero, of the logarithmic derivative of $d$ in the positive direction of the curves $C_i$, $i \ne a$. Its value, as obtained from (38), is found to be

$$-\frac{\partial}{\partial s^i}\log\left|\frac{\partial\varphi^a}{\partial s^a}\right|.$$

As noted by Peters, a necessary and sufficient condition that the hypersurfaces $S_{n-1}$ be equidistant with respect to the curves $C_a$, in that each two of them cut segments of equal length from all the curves $C_a$, is that the distantial spreads of the hypersurfaces $S_{n-1}$ with respect to the curves $C_a$ in the directions of the curves $C_i$, $i \ne a$, all vanish.

To express the fact that the distantial spread, in the direction of the curves $C_i$, of the hypersurfaces $S_{n-1}$ with respect to the curves $C_a$ vanishes, we shall say that the hypersurfaces are equidistant with respect to the curves $C_a$ in the direction of the curves $C_i$.

The conditions of Theorem 13 demand the vanishing of the $a$th components of the distantial spread vectors of each two of the $n-1$ given congruences. The $a$th components of the distantial spread vectors of each of these congruences taken with the congruence of curves $C_a$ are precisely the distantial spreads just defined. For, employing the method of the preceding case, we readily find that


(41) $B_{ia}{}^a = -\dfrac{\partial}{\partial s^i}\log\left|\dfrac{\partial\varphi^a}{\partial s^a}\right|$ ($i \ne a$).


The results we have obtained may be summarized as follows. 


* Loc. cit. 


(ij = 1,2,---,m), 


THEOREM 14. The curves of $n-1$ linearly independent congruences lie in a family of hypersurfaces $S_{n-1}$ if and only if the distantial spread vectors of each two of the congruences are linearly dependent on the tangent vectors of the congruences. If the congruences are those of the ennuple $E$ other than that of the curves $C_a$, the component, in the direction of the curves $C_a$, of the distantial spread vector of the $i$th and $a$th congruences ($i \ne a$) is the distantial spread, in the direction of the curves $C_i$, of the hypersurfaces $S_{n-1}$ with respect to the curves $C_a$. A necessary and sufficient condition that this component vanish is that the hypersurfaces $S_{n-1}$ be equidistant with respect to the curves $C_a$, in the direction of the curves $C_i$.


THEOREM 15. The distantial spread vectors of each two congruences of an en- 
nuple are linear combinations of the tangent vectors of n—1 of the congruences if 
and only if these n—1 congruences lie in a family of hypersurfaces and the 
hypersurfaces are equidistant with respect to the curves of the nth congruence.* 


Returning now to the analytic discussion, we note the equivalence of 
equations (39) and (40), and hence conclude: 


THEOREM 16. The differential of arc $ds^a$ of the curves $C_a$ of an ennuple $E$ possesses an integrating factor if and only if the other curves of $E$ lie in a family of hypersurfaces. If $\varphi^a = \text{const.}$ is an equation of these hypersurfaces, then $\partial\varphi^a/\partial s^a$ is an integrating factor of $ds^a$.
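
Theorem 16 can be illustrated by a familiar case. The following is a sketch under assumptions that are not part of the paper: the space is Euclidean 3-space in cylindrical coordinates $(r, \theta, z)$, the ennuple consists of the three families of coordinate curves, and $C_a$ is taken to be the family of circles ($\theta$-curves).

```latex
% Illustrative example (assumed for this sketch): cylindrical coordinates (r, \theta, z) in
% Euclidean 3-space, ennuple of the coordinate curves, C_a = the circles (\theta-curves).
\begin{align*}
  ds^a &= r\,d\theta
     && \text{(arc of the circles; not exact, since } d(r\,d\theta) = dr \wedge d\theta \neq 0\text{)}, \\
  \varphi^a &= \theta
     && \text{(the other two congruences lie in the half-planes } \theta = \text{const.)}, \\
  \frac{\partial\varphi^a}{\partial s^a} &= \frac{1}{r}\,\frac{\partial\theta}{\partial\theta} = \frac{1}{r}, \\
  \frac{\partial\varphi^a}{\partial s^a}\,ds^a &= \frac{1}{r}\,r\,d\theta = d\theta = d\varphi^a
     && \text{(exact)}.
\end{align*}
```

The radial lines and the lines parallel to the axis both lie in the half-planes $\theta = \text{const.}$, so the hypothesis of the theorem holds, and the factor $1/r$ turns $ds^a$ into the exact differential $d\theta$.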


It will be instructive to look at this question from another point of view. According to Theorem 1, $I\,ds^a$ is an exact differential if and only if $B_{ij}{}^a = 0$, and

(42) $\dfrac{\partial \log I}{\partial s^i} = -B_{ia}{}^a$ ($i \ne a$).

But Theorem 16 and equation (37) guarantee that the conditions $B_{ij}{}^a = 0$, $i, j \ne a$, are sufficient that $ds^a$ possess an integrating factor. Hence, the differential equations (42) must be compatible. Now, on the one hand, equations (42) are equivalent, by Theorem 16, to equations (41), and, on the other hand, the conditions for their complete integrability are
and these equations, by virtue of (37), are special cases of the identity (33). 

7. Parametric, Tchebycheff, and Cartesian ennuples of congruences. Parametric ennuples. If the $n$ congruences of curves of an ennuple are the intersections of $n$ families of hypersurfaces, taken $n-1$ at a time, we shall call the


* For applications of distantial spreads in the case of a parametric ennuple, see Peters, loc. cit. 




ennuple parametric. If $\varphi^i = c_i$ is the equation of the $i$th family of hypersurfaces, $S^i_{n-1}$, and it is the curves $C_i$ of the $i$th congruence which do not lie in the hypersurfaces $S^i_{n-1}$, $i = 1, 2, \dots, n$, then $\varphi^i$ is a parameter for the curves $C_i$ and $\varphi^1, \varphi^2, \dots, \varphi^n$ may be used as parameters (coordinates) in $V_n$.

It is evident geometrically that an ennuple is parametric when and only when each $n-1$ congruences belonging to it lie in a corresponding family of hypersurfaces, or, what amounts to the same thing, according to Theorem 16, if and only if the differential of arc of the curves of each congruence has an integrating factor. Theorem 13 and the subsequent developments lead, then, to the following conclusions.


THEOREM 17. A necessary and sufficient condition that the ennuple $E$ be parametric is that


(43) $B_{ij}{}^k = 0$ ($k \ne i, j$; $i, j, k = 1, 2, \dots, n$).

If, then, $\varphi^i$ is a parameter for the curves $C_i$,


(44) $B_{ik}{}^k = -\dfrac{\partial}{\partial s^i}\log\dfrac{\partial\varphi^k}{\partial s^k}$.


Returning to Theorem 10, we note that equations (35), when $a$ and $b$, as well as $k$, vary from 1 to $n$, are identical with (43). Hence:


THEOREM 18. An ennuple is parametric if and only if each two of its con- 
gruences lie in a corresponding family of two-dimensional surfaces.* 

From Theorems 11 and 14 we get two interpretations of the quantities $B_{ik}{}^k$ of (44).

THEOREM 19. If the ennuple $E$ is parametric, then $B_{ik}{}^k$ is equal to the distantial spread of the curves $C_i$ with respect to the curves $C_k$, and also to the distantial spread, in the direction of the curves $C_i$, of the hypersurfaces $S^k_{n-1}$ with respect to the curves $C_k$.

From (43) and (18) we conclude:

THEOREM 20. A necessary and sufficient condition that the ennuple $E$ be parametric is that $C_{ij}{}^k = C_{ji}{}^k$, $k \ne i, j$, that is, that the angular spread vectors of each two congruences of $E$ with respect to one another have the same intrinsic contravariant components external to the plane of the tangent vectors of the two congruences.

In case E is an orthogonal ennuple, it is readily shown that the equations 


* This condition, expressed in analytic form, is to be found in Bortolotti, Reti di Cebiceff e 
sistemi coniugati nelle V, riemanniane, Rendiconti della Accademia dei Lincei, (6), vol. 5 (1927), 
pp. 741-745. 




$C_{ij}{}^k = C_{ji}{}^k$, $k \ne i, j$, are equivalent, by virtue of (25), to the equations $C_{ij}{}^k = 0$ ($i, j, k$ distinct).

COROLLARY. An orthogonal ennuple is parametric, that is, each of its con- 
gruences is normal, if and only if the angular spread vectors of each two of its 
congruences with respect to one another lie in the plane of the tangent vectors of 
the two congruences. 


Inasmuch as B;;* =0, k Xi, 7, it follows that, if we set 


ViF OF 4 


the conditions of integrability (6) take the form 


Vsi 

and the conditions (8) that fids‘ be an exact differential become V;fi/ Vs? 
Vili/ Vs*, i,j=1, 2, 

The expression V7,;F/ Vs‘ may be called the modified directional derivative 
of F in the direction of the curves C; with respect to the curves C;. It is a generali- 
zation of the modified directional derivative employed, to good effect, in the 
theory of ordinary surfaces.* In comparison with the covariant derivative, 
it has the advantage that it involves, besides F, only the B’s. On the other 


hand, it does not have tensor character, and cannot be extended to apply to 
an arbitrary ennuple of congruences. 

Tchebycheff ennuples. If each differential of arc, $ds^k$, of the ennuple $E$ is an exact differential of a function $s^k$ of the $x$'s, $E$ is a parametric ennuple with the variables $s^k$ as parameters. The linear element, referred to these parameters, is


(45) $ds^2 = g_{ij}\,ds^i ds^j$, $g_{ij} = \cos\omega_{ij}$.
Inasmuch as $s^k$ is the common arc of all the curves $C_k$, $k = 1, 2, \dots, n$, the ennuple is a generalization of a Tchebycheff system of curves clothing a two-dimensional surface and may appropriately be called a Tchebycheff ennuple.

According to Theorem 1, $ds^k$ is an exact differential if and only if $B_{ij}{}^k = 0$. Hence:


THEOREM 21. An ennuple of congruences is a Tchebycheff ennuple if, and 
only if, for each two congruences of the ennuple, the distantial spread vector is a 
null vector, or the angular spread vector of the one with respect to the other is 
identical with that of the second with respect to the first. 
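
A standard instance of Theorem 21 in two dimensions is a translation net. The following worked sketch rests on assumptions not found in the paper: the surface is the Euclidean plane, $\alpha$ and $\beta$ are two curves parametrized by arc length with linearly independent tangents, and the distantial spread vector is computed as the difference of the cross derivatives of the unit tangents.

```latex
% Illustrative example (an assumption of this sketch, not taken from the paper):
% a translation net in the Euclidean plane.
\begin{align*}
  x(u, v) &= \alpha(u) + \beta(v), \qquad |\alpha'(u)| = |\beta'(v)| = 1, \\
  a_1 &= \frac{\partial x}{\partial u} = \alpha'(u), \qquad
  a_2 = \frac{\partial x}{\partial v} = \beta'(v), \\
  \frac{\partial a_1}{\partial v} - \frac{\partial a_2}{\partial u} &= 0 - 0 = 0, \\
  ds^2 &= du^2 + 2\cos\omega\,du\,dv + dv^2, \qquad
  \cos\omega = \alpha'(u)\cdot\beta'(v).
\end{align*}
```

The parameters $u$ and $v$ are arcs common to all curves of their respective families, and the distantial spread vector vanishes, so the net is of Tchebycheff type. The angle $\omega$ generally varies from point to point, and it becomes constant when both $\alpha$ and $\beta$ are straight lines.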


* Graustein, loc. cit., p. 575. The generalization was first discovered by Ruth M. Peters, in an- 
other connection. 




It is to be noted that a parametric ennuple is, in particular, a Tchebycheff ennuple if and only if the $k$th family of hypersurfaces is an equidistant family with respect to the $k$th congruence of curves, $k = 1, 2, \dots, n$; see Theorem 15.
From (24) and (25) we conclude, inasmuch as Cy, =0, that 


Og rk 


Crm = — Benn. 


From these relations we may obtain interesting conditions under which the 
curves C;, are geodesics. In particular, we have 

THEOREM 22. The curves of a Tchebycheff ennuple are all geodesics if and 
only if the angle between the curves of each two congruences is constant along the 
curves of both congruences.* 

It follows that the linear element (45), where $s^i$ are actual coordinates and $g_{ij}$ is independent of $s^i$ and $s^j$, $i, j = 1, 2, \dots, n$, is characteristic of a space which contains a Tchebycheff ennuple of geodesics.

Cartesian ennuples. When $E$ is a Tchebycheff ennuple and $g_{ij}$ are constants, the linear element (45) characterizes $V_n$ as a euclidean space referred to the ennuple of congruences of coordinate curves of a Cartesian coordinate system in which the unit of measure for each coordinate is the unit distance of the space. Accordingly, we shall call a Tchebycheff ennuple for which the $g_{ij}$ are constant a Cartesian ennuple. We may then state

THEOREM 23. A necessary and sufficient condition that $V_n$ be euclidean is that it contain a Cartesian ennuple of congruences.

This type of ennuple may be characterized analytically and geometrically 
as follows. 

THEOREM 24. An ennuple of congruences is a Cartesian ennuple if and only if

(46) $B_{ij}{}^k = 0$, $\dfrac{\partial g_{ij}}{\partial s^k} = 0$ ($i, j, k = 1, 2, \dots, n$),

or

(47) $C_{ij}{}^k = 0$ ($i, j, k = 1, 2, \dots, n$),


* The counterpart of this theorem, to the effect that the curves of each congruence of a Tcheby- 
cheff ennuple are parallel with respect to those of every other congruence if and only if the angle be- 
tween the curves of each two congruences is constant along the curves of the remaining congruences, 
is given by Bortolotti, loc. cit., p. 741. It is to be noted that Bortolotti’s conception of a Tchebycheff 
ennuple differs from the one here used. 




that is, if and only if it is a Tchebycheff ennuple each two of whose congruences intersect under a constant angle or has the property that the curves of each congruence are geodesics and are parallel with respect to the curves of every other congruence.


The equivalence of equations (46) and (47) follows from previous relations. If $\partial g_{ij}/\partial s^k = 0$ and $B_{ijk} = 0$, then (26) says that $C_{ijk} = 0$; and, if $C_{ijk} = 0$, (25) and (24) tell us that $\partial g_{ij}/\partial s^k = 0$ and $B_{ijk} = 0$. But the equations $B_{ijk} = 0$ and $B_{ij}{}^k = 0$ are equivalent, and also the equations $C_{ijk} = 0$ and $C_{ij}{}^k = 0$.

8. Ennuples Cartesian at a point. We shall say that an ennuple of con- 
gruences is Cartesian at a particular point P if it behaves at P like a Cartesian 
ennuple, that is, if it has at P the analytic or geometric characteristics de- 
scribed in Theorem 23. 


THEOREM 25. If $x^i$ are geodesic coordinates at $P$, the ennuple $E$ is Cartesian at $P$ if and only if the ordinary components, $\bar a_j|^i$, of the unit tangent vectors of $E$ behave like constants at $P$.


Since $x^i$ are geodesic coordinates at $P$, $\bar C_{ij}{}^k = 0$ at $P$. Hence, according to (17), $C_{ij}{}^k = 0$ at $P$ if and only if $\bar a_j|^i$ behave like constants at $P$. Thus, the theorem is proved.

Inasmuch as, when $x^i$ are geodesic coordinates at $P$, $\bar g_{ij}$ behave like constants at $P$, it follows that, if the ordinary components of a vector field behave like constants at $P$, so also do the ordinary components of the corresponding field of unit vectors. Hence, we conclude:


THEOREM 26. There exist infinitely many ennuples which are Cartesian at 
a given point P and have at P prescribed tangent vectors. 


If $\bar a_j|^i$ behave like constants at $P$, so also do $\bar a^k|_j$. Hence, if $x^i$ are geodesic coordinates at $P$: $(0, 0, \dots, 0)$ and the ennuple $E$ is Cartesian at $P$,

$$ds^k = \bar a^k|_j\,dx^j = (\bar a^k|_j)_0\,dx^j + \cdots.$$

Consequently, if terms in $x^i$ of the second degree and higher are neglected, the differentials of arc $ds^k$ become exact differentials and we may write $s^k = (\bar a^k|_j)_0\,x^j$.

Comparison with geodesic coordinates. We assume now that the ennuple $E$ is a parametric ennuple and inquire whether, if parameters for $E$ are geodesic at a point $P$, $E$ is Cartesian at $P$, and conversely.

We may assume that $E$ is the parametric ennuple corresponding to the basic coordinates $x^i$. According to Theorem 17, we have, then,


(48) $B_{ij}{}^j = -\dfrac{\partial}{\partial s^i}\log\dfrac{\partial x^j}{\partial s^j}$ ($i, j = 1, 2, \dots, n$),

whereas $B_{ij}{}^k = 0$ for $k \ne i, j$.
Since in this case 


and hence 


equation (17) becomes 


i k 


whence 


0 
(49b) = + — log | — |. 


Furthermore, we have, from (10a), since g;;=1, 


asi} 
If the coordinates $x^i$ are geodesic at $P$, $\bar C_{ij}{}^k = 0$ at $P$; and $\bar g_{ij}$, and hence $\partial x^i/\partial s^i$, act like constants at $P$. It follows, then, from (49a), that $C_{ij}{}^k = 0$ at $P$, that is, that $E$ is Cartesian at $P$.
If, conversely, $E$ is Cartesian at $P$, $C_{ij}{}^k = 0$ and $B_{ij}{}^k = 0$ at $P$. Hence, from (49a) and (48), we find that $\bar C_{ij}{}^k = 0$ at $P$ except when $i = j = k$, whereas from (49b) we get


Thus, the coordinates $x^i$ are not necessarily geodesic at $P$; they are geodesic if and only if $\partial^2 x^i/(\partial s^i)^2 = 0$ at $P$. However, it is evident that there exist coordinates $X^i$ for $E$ which are geodesic at $P$; we have merely to choose $X^i = X^i(x^i)$, $i = 1, 2, \dots, n$, so that $\partial^2 X^i/(\partial s^i)^2 = 0$ at $P$.*

We have now answered the proposed question. 

THEOREM 27. There exist, for a parametric ennuple, coordinates which are 
geodesic at a given point P if and only if the ennuple is Cartesian at P. 

* It would appear from this discussion that a completely geometric characterization of geodesic 


coordinates is impossible. In this connection, see Levi-Civita, The Absolute Differential Calculus, 
p. 168. 


—=0 (i ~#7;4,7 = 1,2,--+ , mn), 

Os? 

: 
—— —_= — 1 1, = n 
dx! dx! asi 
a axi i 
+ leg |, 
* asi Os? 
Ox? 
Tutt tog || = 0 at 
Os* j 




It is evident from this discussion that the concept of an ennuple Cartesian 
at a point is more fundamental than the concept of geodesic coordinates. 
Furthermore it is also more general, in that an ennuple Cartesian at a point 
does not need to be a parametric ennuple at all. 

That an ennuple Cartesian at a point serves all the purposes for which 
geodesic coordinates are ordinarily employed is guaranteed by the following 
proposition. 

THEOREM 28. When the ennuple of reference is Cartesian at a point P, then 
at P intrinsic covariant differentiation becomes directional differentiation and 
second cross directional derivatives are independent of the order of differentiation. 


From (6) and Theorem 21 it follows that second cross directional deriva- 
tives are independent of the order of differentiation at a given point P if and 
only if the ennuple of reference behaves like a Tchebycheff ennuple at P. 

9. Intrinsic components of the Riemann tensors. It may be shown in various ways that the intrinsic components, referred to the ennuple $E$, of the Riemann tensor with the ordinary components


(S30) Rin = + — 
are 


+ my? — + Cin’ By”. 


(51) Rin = 


From = gniR' ix, it follows that 


dsi ds* 

are the intrinsic components of the Riemann tensor whose ordinary com- 

ponents are = 

The identities satisfied by the covariant Riemann tensor have, of course, the same form in terms of its intrinsic components as in terms of its ordinary components. In this connection, it is of interest to note that the identities $R_{hijk} + R_{hjki} + R_{hkij} = 0$ are equivalent to the identities (34).*

The conditions of integrability for covariant differentiation also have the same forms in terms of intrinsic components as in terms of ordinary components. Thus, if $a_i$, $a_{i,j}$, and $a_{i,jk}$ are respectively the intrinsic components of a covariant vector and its first and second covariant derivatives, then

* See Dei, Sulle relazioni differenziali che legano i coefficienti di rotazione del Ricci, Rendiconti della 


Accademia dei Lincei, (5), vol. 32 (1923), pp. 474-478, where this conclusion is reached in essence, 
though not in form, in the case of an orthogonal ennuple. 




(53) jk — = OR 
Again, in the case of a contravariant vector, we have 

Setting a‘=a,|*=6i in (54) and a;=as|;=gai in (53), we obtain the 
following expressions for R*;;, and Ryj;x: 

Rix = a; |" = a; |* ik, Rix = an | = an|i nis 

in terms of the derivatives of the unit vectors tangent to the curves of $E$. On the other hand, formulas (51) and (52), in light of (29) and (31), furnish expressions for $R^h{}_{ijk}$ and $R_{hijk}$ in terms of the curvature and angular and distantial spread vectors of $E$, namely,


Rin = Cin + Cim jx |", 
= Cie — + Cim| adie |”. 
To these relations may be adjoined the simple expressions 
= Carle = 


for the curvature and angular spread vectors of $E$ in terms of the tangent
vectors.* 

The Ricci tensor. From (50) we find as the components of the Ricci ten- 
sor, Ry; = 


Ri; -Cin* —C;;* + Ca"Cn;* + CimeB 
Os! ds* 


Adding to the right-hand side of this equation the expression 


- Bt +> + 
2 \ds* Os! ds* 
whose value is readily shown to be zero by (33), and making use of (18) and 
the relation O(log g'/?)/ds‘=C,, we find the following symmetric expression 
for R;;: 
Ri; = (log 43 + + (C44! + Cx*) 
2 ds* 2 2 ds* 

— + Cys") Bim™ + Cir"Cjm*. 

Geometric interpretations. We note, without going into detail, that the 


* It is to be noted that none of the relations in this paragraph are invariant in form with respect 
to a change from intrinsic to ordinary components. 




intrinsic components of the Riemann and Ricci tensors, especially in case E 
is an orthogonal ennuple, have interesting geometrical interpretations in 
terms of the Riemannian and mean curvatures. 

10. Metric connections. The torsion vector. The most general connection which possesses a metric based on the tensor $\bar g_{ij}$ is obtained by employing, instead of the Christoffel symbols $\bar C_{ij}{}^k$, arbitrary coefficients of connection, $\bar\Gamma_{ij}{}^k$, such that
(55) = — — =0 (i,j,k = 1,2,---,m). 


The skew-symmetric part of $\bar\Gamma_{ij}{}^k$, namely


(56) $\bar S_{ij}{}^k = \tfrac12\,(\bar\Gamma_{ij}{}^k - \bar\Gamma_{ji}{}^k)$,


has tensor character. It is called the torsion tensor of the connection.

When the torsion tensor $\bar S_{ij}{}^k$ is known, the connection $\bar\Gamma_{ij}{}^k$ is completely determined. For it is readily found, from (55) and (56), that
(57) Tie = Cie + (Sim + Sins — 
where, for example, T';;, = Z,:1';;*. In particular, the connection is Riemannian 
if and only if the torsion tensor is null. 

By replacing $\bar C_{ij}{}^k$ and $C_{hk}{}^i$ in (17) by $\bar\Gamma_{ij}{}^k$ and $\Gamma_{hk}{}^i$, we obtain equations which define the invariant coefficients of connection, $\Gamma_{hk}{}^i$, referred to the ennuple $E$. In the same way, we obtain from (19) and (21) the formulas for covariant differentiation with respect to the connection, and, from (27), the expressions for the new angular spread or curvature vectors. From the latter, it follows that $\Gamma_{hk}{}^i$ are the intrinsic contravariant components of the angular spread vector, $c_{hk}|^i$, of the congruence of the curves $C_h$ with respect to the congruence of curves $C_k$. On the other hand, the components $B_{hk}{}^i$ of the distantial spread vector, $b_{hk}|^i$, of the curves $C_h$ and $C_k$ remain the same as before; this vector is dependent only on the components, $\bar g_{ij}$, of the metric.

From the laws of transformation of $\bar\Gamma_{ij}{}^k$ and $\bar C_{ij}{}^k$ into $\Gamma_{ij}{}^k$ and $C_{ij}{}^k$, it follows that $\bar\Gamma_{ij}{}^k - \bar C_{ij}{}^k$ is a tensor whose intrinsic components are $\Gamma_{ij}{}^k - C_{ij}{}^k$. Hence, we conclude from (57) that


(58) Vise = + + Sins — Susi), 


where $S_{ijk}$ are the intrinsic components of the tensor $\bar S_{ijk}$.
From these relations we obtain, by virtue of (24), the following expressions for the intrinsic components of the torsion tensor*:


* An equation of the same form as this holds for the general linear connection; see Horak, loc. 
cit., p. 197. 


(59) $2S_{ij}{}^k = (\Gamma_{ij}{}^k - \Gamma_{ji}{}^k) - B_{ij}{}^k$.


Torsion vectors. In a given oriented planar element at a point $P$ choose two ordered infinitesimal vectors $PP_1$ and $PP_2$ such that the direction of rotation about $P$ from the first vector to the second is that of the given orientation. Transport each vector along the other by the displacement determined by $\bar\Gamma_{ij}{}^k$, thus obtaining the new vectors $P_2Q_1$ and $P_1Q_2$. Then the limit of the ratio of the vector $Q_1Q_2$ to the area of the parallelogram determined by the vectors $PP_1$ and $PP_2$ is a vector at $P$ which is independent of these vectors, provided merely that they are chosen as described, and so pertains simply to the given oriented planar element. This vector is due to Cartan* and is called by him the torsion vector at $P$ for the given oriented planar element.

If the vectors $PP_1$ and $PP_2$ are respectively $d_1x^{i}$ and $d_2x^{i}$, the coordinates
of $Q_1$ are, to within terms of higher order,

    $x^{i} + d_2x^{i} + d_1x^{i} - \Gamma_{jk}^{i}\,d_1x^{j}\,d_2x^{k}.$

Hence, the torsion vector at P for the given oriented planar element has the
components

    $2\csc\theta\; S_{jl}{}^{i}\,a_h|^{j}\,a_k|^{l},$

where $a_h|^{i}$ and $a_k|^{i}$ are the unit vectors in the directions of $PP_1$ and $PP_2$ and
$\theta$ is the angle between them.
We shall find it convenient to employ, instead of the torsion vector of
Cartan, a torsion vector for two ordered directions at a point. This we define by
the equations

(60)    $s_{hk}|^{i} = -2S_{jl}{}^{i}\,a_h|^{j}\,a_k|^{l},$

where $a_h|^{i}$ and $a_k|^{i}$ are the unit vectors in the two directions. In particular,
if $a_h|^{i}$ and $a_k|^{i}$ are the fields of unit vectors tangent to the curves of the hth
and kth congruences of the ennuple E, we shall call $s_{hk}|^{i}$ the torsion vector
of these congruences, in the order given.

A geometric interpretation of the torsion vector (60) is obtained by
rephrasing the definition of the torsion vector of Cartan. A second
interpretation, more useful to us, is the following: If, in the definition of the distantial
spread vector of the congruences of curves $C_h$ and $C_k$, we redefine $Q_1$ and $Q_2$ as the
terminal points of the vectors at $P_1$ and $P_2$ which are parallel respectively,
according to the connection $\Gamma_{ij}^{k}$, to the vectors $PP_2$ and $PP_1$, the definition becomes
a description of the torsion vector of the congruences, in the order given.


* Sur les variétés à connexion affine et la théorie de la relativité généralisée, Annales de l'École
Normale, (3), vol. 40 (1923), pp. 325-412.


1934] THE GEOMETRY OF RIEMANNIAN SPACES 571 


To establish this interpretation, it suffices to note that the coordinates of
the new point $Q_1$ are, to within terms of higher order,

    $x^{i} + d_1x^{i} + d_2x^{i} - \Gamma_{jk}^{i}\,d_2x^{j}\,d_1x^{k}.$

It follows from the interpretation that a sufficient condition that the
torsion vector of two congruences be identical with their distantial spread vector
is that the curves of each congruence be parallel, according to the connection
$\Gamma_{ij}^{k}$, with respect to those of the other.* This is not a necessary condition, as
we shall see shortly.

Whereas the distantial spread vector of the two congruences depends 
only on the metric, the torsion vector depends also on the connection. If the 
connection is Riemannian, the torsion vector is always null. 

In terms of intrinsic components, referred to the ennuple E, (60) becomes

    $s_{hk}|^{i} = -2S_{jl}{}^{i}\,a_h|^{j}\,a_k|^{l} = -2S_{hk}{}^{i}.$


Thus we have simple geometric interpretations of the intrinsic components 
of the tensor of torsion. 


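THEOREM 29. The intrinsic components, $s_{hk}|^{i}$ and $s_{hk}|_{i}$, of the torsion vector
of the curves $C_h$ and $C_k$ are respectively $-2S_{hk}{}^{i}$ and $-2S_{hki}$:

    $s_{hk}|^{i} = -2S_{hk}{}^{i}, \qquad s_{hk}|_{i} = -2S_{hki}.$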


We may now interpret the important relations (59), by rewriting them
in the form

(61)    $s_{hk}|^{i} = b_{hk}|^{i} - (\gamma_{hk}|^{i} - \gamma_{kh}|^{i}).$

THEOREM 30. The difference between the distantial spread vector and the
torsion vector of two ordered congruences is equal to the difference between the
angular spread vectors of the two congruences with respect to one another.


It is now clear that a necessary and sufficient condition that the torsion 
vector of two ordered congruences coincide with their distantial spread 
vector is that the angular spread vectors of the congruences with respect to 
one another be identical. 

We shall say that the connection $\Gamma_{ij}^{k}$ is symmetric with respect to the
ennuple E if $\Gamma_{ij}^{k}=\Gamma_{ji}^{k}$ for i, j, k = 1, 2, ..., n. From (59) or (61) we infer,
then, the following proposition:

THEOREM 31. The distantial spread and torsion vectors of each two
congruences of the ennuple E are identical if and only if the connection is symmetric
with respect to E.
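
Relation (61) and Theorem 31 can be checked on small arrays. The following is a minimal sketch, not part of the original argument: the arrays `gamma` and `b` are made-up intrinsic components (indexed `[h][k][i]`), the function names are ours, and no attempt is made to verify that the numbers come from an actual metric connection.

```python
from fractions import Fraction

def zeros(n):
    """n x n x n array of zeros, indexed [h][k][i]."""
    return [[[Fraction(0)] * n for _ in range(n)] for _ in range(n)]

def torsion_vectors(gamma, b):
    """Relation (61): s_hk|^i = b_hk|^i - (gamma_hk|^i - gamma_kh|^i)."""
    n = len(b)
    return [[[b[h][k][i] - (gamma[h][k][i] - gamma[k][h][i])
              for i in range(n)] for k in range(n)] for h in range(n)]

def is_symmetric(gamma):
    """True when Gamma_ij^k = Gamma_ji^k for all i, j, k."""
    n = len(gamma)
    return all(gamma[i][j][k] == gamma[j][i][k]
               for i in range(n) for j in range(n) for k in range(n))

n = 3
# Hypothetical distantial spread components: b_12|^3 = 1 = -b_21|^3, all others zero.
b = zeros(n)
b[0][1][2], b[1][0][2] = Fraction(1), Fraction(-1)

# Case 1: coefficients symmetric with respect to E (here all zero).
gamma_sym = zeros(n)
s_sym = torsion_vectors(gamma_sym, b)

# Case 2: coefficients whose antisymmetric part equals b, so that the torsion vanishes.
gamma_bal = zeros(n)
gamma_bal[0][1][2], gamma_bal[1][0][2] = Fraction(1, 2), Fraction(-1, 2)
s_bal = torsion_vectors(gamma_bal, b)

print(s_sym == b)
print(all(x == 0 for plane in s_bal for vec in plane for x in vec))
```

In the first case the torsion vectors coincide with the distantial spread vectors, as Theorem 31 asserts for a symmetric connection; in the second the coefficients are not symmetric and the torsion vectors are null.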


* The analytic content of this statement, in a different form, is to be found in Bortolotti,
Parallelismi assoluti nelle Vn riemanniane, Atti del Reale Istituto Veneto, vol. 86 (1927), pp. 455-465.




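For a given metric with the fundamental tensor $g_{ij}$, there is a unique metric
connection which is symmetric with respect to E. For it follows from $g_{ij,k}=0$,
or from (58), (59), and (26), that, if $\Gamma_{ij}^{k}=\Gamma_{ji}^{k}$, then

(62)    $\Gamma_{ij}^{k} = \tfrac{1}{2}\,g^{kl}\left(\frac{\partial g_{il}}{\partial s^{j}} + \frac{\partial g_{jl}}{\partial s^{i}} - \frac{\partial g_{ij}}{\partial s^{l}}\right).$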


Spaces admitting absolute parallelism.* The given space is said to admit
(complete) absolute parallelism if there exist n linearly independent fields of
absolutely parallel vectors, or, what amounts to the same thing, if there
exists an ennuple of congruences such that the curves of each congruence are
parallel with respect to all congruences. It is evident from the geometric
interpretation of $\Gamma_{ij}^{k}$ that this is true of the ennuple E if and only if $\Gamma_{ij}^{k}=0$
for i, j, k = 1, 2, ..., n, and hence, according to (62), if and only if $\Gamma_{ij}^{k}$ is
symmetric in i, j and $g_{ij}$ are constants.


THEOREM 32. A metric connection admits absolute parallelism if and only 
if there exists an ennuple of congruences, E, which has constant angles and with 
respect to which the connection is symmetric. 


Since $\Gamma_{ij}^{k}=0$, it follows that the intrinsic components, referred to E, of
the covariant derivative of a tensor are simply the directional derivatives,
along the curves of E, of the intrinsic components of the tensor.

The intrinsic components of the curvature tensor of the connection are
obtained by replacing $C_{ij}^{k}$ by $\Gamma_{ij}^{k}$ in (51). Hence, when the connection
admits absolute parallelism, the curvature tensor is actually zero, as it should
be.

11. Transformation from one ennuple of congruences to a second. We
return now to the study of Riemannian space, and assume that there is given,
in addition to E, a second ennuple, E', consisting of the congruences of
directed curves $C_i'$, i = 1, 2, ..., n.

We shall distinguish by primes symbols referred to the ennuple E'. For
example, we shall denote by $g'_{ij}$ the (intrinsic) covariant components of the
fundamental tensor, referred to E'.

The components, referred to E, of the tangent and conjugate vectors and
the distantial and angular spread vectors of the congruences of E we have
denoted by $a_h|$, $a^h|$, $b_{hk}|$, $c_{hk}|$. The components, referred to E', of the
corresponding fundamental vectors connected with the congruences of E' we
shall designate by $\alpha'_h|$, $\alpha^{\prime h}|$, $\beta'_{hk}|$, $\gamma'_{hk}|$. According to the foregoing
convention, $\alpha_h|$, $\alpha^h|$, $\beta_{hk}|$, $\gamma_{hk}|$ then denote the components, referred to E, of the


* For a review of this subject, see Eisenhart, Spaces admitting complete absolute parallelism, 
Bulletin of the American Mathematical Society, vol. 39 (1933), pp. 217-226. 




fundamental vectors pertaining to E', and $a'_h|$, $a^{\prime h}|$, $b'_{hk}|$, $c'_{hk}|$ the components,
referred to E', of the fundamental vectors pertaining to E.
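In §4 we noted that

(63)    $\frac{dx^{i}}{ds}, \qquad \frac{ds^{i}}{ds}$

are respectively the ordinary components and the components referred to
E of the unit contravariant vector tangent to the directed curve C. It
follows, then, that

(64)    $\frac{ds^{i}}{ds} = \frac{\partial s^{i}}{\partial x^{j}}\frac{dx^{j}}{ds}, \qquad \frac{dx^{i}}{ds} = \frac{\partial x^{i}}{\partial s^{j}}\frac{ds^{j}}{ds}, \qquad \frac{df}{ds} = \frac{\partial f}{\partial s^{i}}\frac{ds^{i}}{ds}.$

Similar formulas hold when C is referred to E' instead of to E.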

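The second of formulas (63) suggests that we denote by $\partial s^{i}/\partial s^{\prime h}$ the
contravariant components, referred to E, of the unit vector tangent to the
general curve $C_h'$ of the hth congruence of E':

(65)    $\alpha_h|^{i} = \frac{\partial s^{i}}{\partial s^{\prime h}},$

and by $\partial s^{\prime i}/\partial s^{h}$ the contravariant components, referred to E', of the unit
vector tangent to the general curve $C_h$ of the hth congruence of E:

(66)    $a'_h|^{i} = \frac{\partial s^{\prime i}}{\partial s^{h}}.$

From the first two formulas in (64) and the corresponding formulas
mentioned in connection with them, we have

(67)    $\frac{\partial s^{i}}{\partial s^{\prime h}} = \frac{\partial s^{i}}{\partial x^{j}}\frac{\partial x^{j}}{\partial s^{\prime h}}, \qquad \frac{\partial s^{\prime i}}{\partial s^{h}} = \frac{\partial s^{\prime i}}{\partial x^{j}}\frac{\partial x^{j}}{\partial s^{h}},$

where $\partial s^{\prime i}/\partial x^{j}$ has the same significance with respect to E' as has $\partial s^{i}/\partial x^{j}$ with
respect to E.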

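Using (67) in conjunction with (2) and similar equations in $\partial x^{i}/\partial s^{\prime j}$ and
$\partial s^{\prime i}/\partial x^{j}$, we readily establish the fundamental relations

(68)    $\frac{\partial s^{\prime i}}{\partial s^{k}}\frac{\partial s^{k}}{\partial s^{\prime j}} = \delta^{i}_{j}, \qquad \frac{\partial s^{i}}{\partial s^{\prime k}}\frac{\partial s^{\prime k}}{\partial s^{j}} = \delta^{i}_{j},$

which state that the systems of quantities $\partial s^{i}/\partial s^{\prime j}$ and $\partial s^{\prime i}/\partial s^{j}$ are conjugate
to one another.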
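
The conjugacy expressed by (68) may be illustrated numerically in the Euclidean plane. The sketch below rests on assumptions of our own: E is taken as the Cartesian ennuple, E' consists of the straight lines in two arbitrarily chosen directions `u` and `v`, and the array names `ds_dsp` and `dsp_ds` are hypothetical. The second array is obtained independently, by resolving the unit vectors of E along the directions of E', and the products are then compared with the identity.

```python
import math

u, v = math.radians(20.0), math.radians(75.0)   # directions of the curves C'_1, C'_2

# ds_dsp[i][h] = ds^i/ds'^h: components, referred to E, of the h-th unit tangent vector of E'.
ds_dsp = [[math.cos(u), math.cos(v)],
          [math.sin(u), math.sin(v)]]

# dsp_ds[i][h] = ds'^i/ds^h: components, referred to E', of the h-th unit tangent vector of E,
# found by resolving (1, 0) and (0, 1) along the two directions u and v.
d = math.sin(v - u)
dsp_ds = [[math.sin(v) / d, -math.cos(v) / d],
          [-math.sin(u) / d, math.cos(u) / d]]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def is_identity(m, tol=1e-12):
    return all(abs(m[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(2) for j in range(2))

print(is_identity(matmul(dsp_ds, ds_dsp)), is_identity(matmul(ds_dsp, dsp_ds)))

# A step of length `step` along a curve C'_1 has components ds^j = (ds^j/ds'^1) * step in E;
# the Pfaffians (ds'^i/ds^j) ds^j should then return (step, 0).
step = 0.01
ds = [ds_dsp[j][0] * step for j in range(2)]
ds_prime = [sum(dsp_ds[i][j] * ds[j] for j in range(2)) for i in range(2)]
print(ds_prime)
```

The last lines evaluate the Pfaffians $(\partial s^{\prime i}/\partial s^{j})\,ds^{j}$ along a curve of the first congruence of E': only the first is different from zero, and it equals the differential of arc of that curve.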




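According to the first of these equations, the Pfaffian $(\partial s^{\prime i}/\partial s^{j})\,ds^{j}$ is zero
for every curve of the ennuple E' except a curve $C_i'$ and for a curve $C_i'$ is
equal to the differential of arc, $ds^{\prime i}$, of the curve. Thus we obtain, as the
relations between $ds^{i}$ and $ds^{\prime i}$:

(69)    $ds^{\prime i} = \frac{\partial s^{\prime i}}{\partial s^{j}}\,ds^{j}, \qquad ds^{i} = \frac{\partial s^{i}}{\partial s^{\prime j}}\,ds^{\prime j}.$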

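Applying the last of the equations (64), we find

(70)    $\frac{\partial f}{\partial s^{\prime i}} = \frac{\partial f}{\partial s^{j}}\frac{\partial s^{j}}{\partial s^{\prime i}}, \qquad \frac{\partial f}{\partial s^{i}} = \frac{\partial f}{\partial s^{\prime j}}\frac{\partial s^{\prime j}}{\partial s^{i}}$

as the relations between the directional derivatives in the positive directions
of the curves $C_i$ and those in the positive directions of the curves $C_i'$.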

Formulas (69) and (70) guarantee that the transformation from the
components of a tensor, referred to E, to the components of the tensor, referred to E',
obeys the formal laws of tensor analysis. The relations between the two sets of
components for the fundamental tensor are, for example,

(71)    $g'_{ij} = g_{kl}\,\frac{\partial s^{k}}{\partial s^{\prime i}}\frac{\partial s^{l}}{\partial s^{\prime j}}, \qquad g^{\prime ij} = g^{kl}\,\frac{\partial s^{\prime i}}{\partial s^{k}}\frac{\partial s^{\prime j}}{\partial s^{l}},$


and the corresponding relations in the case of an arbitrary vector are 


(72) 


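Tangent and conjugate vector-fields of E and E'. The components, referred
respectively to E and E', of the field of unit vectors tangent to the curves $C_h$
of the ennuple E are

(73a)    $a_h|^{i} = \delta_h^{i}, \qquad a'_h|^{i} = \frac{\partial s^{\prime i}}{\partial s^{h}},$

(73b)    $a_h|_{i} = g_{hi}, \qquad a'_h|_{i} = g'_{ij}\frac{\partial s^{\prime j}}{\partial s^{h}} = g_{hj}\frac{\partial s^{j}}{\partial s^{\prime i}}.$

The formulas on the left are identical with (16b); those on the right follow
from them by means of (72), (71), and (68).

Similarly, we find, as the components of the hth field of conjugate vectors
associated with the ennuple E,

(74a)    $a^{h}|_{i} = \delta^{h}_{i}, \qquad a^{\prime h}|_{i} = \frac{\partial s^{h}}{\partial s^{\prime i}},$

(74b)    $a^{h}|^{i} = g^{hi}, \qquad a^{\prime h}|^{i} = g^{\prime ij}\frac{\partial s^{h}}{\partial s^{\prime j}} = g^{hj}\frac{\partial s^{\prime i}}{\partial s^{j}}.$

The components, referred respectively to E' and E, of the field of unit
vectors tangent to the curves $C_h'$ of the ennuple E' are

(75a)    $\alpha'_h|^{i} = \delta_h^{i}, \qquad \alpha_h|^{i} = \frac{\partial s^{i}}{\partial s^{\prime h}},$

(75b)    $\alpha'_h|_{i} = g'_{hi}, \qquad \alpha_h|_{i} = g_{ij}\frac{\partial s^{j}}{\partial s^{\prime h}} = g'_{hj}\frac{\partial s^{\prime j}}{\partial s^{i}},$

while those of the hth field of conjugate vectors associated with E' are

(76a)    $\alpha^{\prime h}|_{i} = \delta^{h}_{i}, \qquad \alpha^{h}|_{i} = \frac{\partial s^{\prime h}}{\partial s^{i}},$

(76b)    $\alpha^{\prime h}|^{i} = g^{\prime hi}, \qquad \alpha^{h}|^{i} = g^{ij}\frac{\partial s^{\prime h}}{\partial s^{j}} = g^{\prime hj}\frac{\partial s^{i}}{\partial s^{\prime j}}.$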
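From the relations

(77)    $\alpha_i|^{j} = a^{\prime j}|_{i} = \frac{\partial s^{j}}{\partial s^{\prime i}}, \qquad \alpha^{i}|_{j} = a'_j|^{i} = \frac{\partial s^{\prime i}}{\partial s^{j}},$

it follows that, when i is fixed and j = 1, 2, ..., n, $\partial s^{j}/\partial s^{\prime i}$ and $\partial s^{\prime i}/\partial s^{j}$ are
respectively components, referred to E, of the ith tangent and conjugate
vectors associated with E', whereas, when j is fixed and i = 1, 2, ..., n, they
are respectively components, referred to E', of the jth conjugate and tangent
vectors associated with E.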

Interpretations in terms of angles. If $\phi_{hi}$ is the angle which the hth tangent
vector of E' makes with the ith conjugate vector of E and $\phi'_{hi}$ is the angle
which the hth tangent vector of E makes with the ith conjugate vector of E',
it follows, either directly or by virtue of Theorem 2, that

(78)    $\alpha_h|^{i} = \frac{\partial s^{i}}{\partial s^{\prime h}} = \sec\theta_i\cos\phi_{hi}, \qquad a'_h|^{i} = \frac{\partial s^{\prime i}}{\partial s^{h}} = \sec\theta'_i\cos\phi'_{hi},$

where $\theta_i$ is the angle between the ith tangent and conjugate vectors of E
and $\theta'_i$ is the angle between the ith tangent and conjugate vectors of E'.
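
The first of these relations can be tested on an oblique pair of directions in the Euclidean plane. The following sketch assumes, as the notation of the paper suggests, that the components referred to E are components with respect to the unit tangent vectors of E and that the conjugate vectors form the reciprocal (dual) set; the chosen angles and all identifiers are hypothetical. The components are computed by Cramer's rule, independently of the conjugate vectors, and compared with $\sec\theta_i\cos\phi_{hi}$.

```python
import math

def unit(deg):
    t = math.radians(deg)
    return (math.cos(t), math.sin(t))

def dot(p, q):
    return p[0] * q[0] + p[1] * q[1]

def det(p, q):
    return p[0] * q[1] - p[1] * q[0]

def angle(p, q):
    return math.acos(dot(p, q) / math.sqrt(dot(p, p) * dot(q, q)))

a = [unit(0.0), unit(60.0)]          # unit tangent vectors of E (oblique)
alpha = [unit(15.0), unit(100.0)]    # unit tangent vectors of E'

# Conjugate vectors of E: dot(conj[i], a[j]) is 1 if i == j and 0 otherwise.
D = det(a[0], a[1])
conj = [(a[1][1] / D, -a[1][0] / D), (-a[0][1] / D, a[0][0] / D)]

# theta[i]: angle between the i-th tangent and conjugate vectors of E.
theta = [angle(a[i], conj[i]) for i in range(2)]

def components(vec):
    """Contravariant components of vec referred to E, by Cramer's rule."""
    return [det(vec, a[1]) / D, det(a[0], vec) / D]

max_err = 0.0
for h in range(2):
    comps = components(alpha[h])
    for i in range(2):
        phi = angle(alpha[h], conj[i])                    # the angle phi_hi
        predicted = math.cos(phi) / math.cos(theta[i])    # sec(theta_i) * cos(phi_hi)
        max_err = max(max_err, abs(comps[i] - predicted))

print(max_err < 1e-12)
```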




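If E is an orthogonal ennuple, the angles $\theta_i$ are all zero, and
$\alpha_h|^{i}=\partial s^{i}/\partial s^{\prime h}=\cos\phi_{hi}$, so that $\partial s^{i}/\partial s^{\prime h}$ are direction cosines. If E' is
also orthogonal, then $\phi'_{ih}=\phi_{hi}$ and hence $\partial s^{\prime h}/\partial s^{i}=\partial s^{i}/\partial s^{\prime h}$ for all i, h.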
Returning to the general case, we remark that, inasmuch as we now have
interpretations in terms of angles of $\partial s^{i}/\partial s^{\prime j}$, $\partial s^{\prime i}/\partial s^{j}$ and $g_{ij}$, $g^{ij}$, $g'_{ij}$, $g^{\prime ij}$ (see
§2), we may write all of the formulas (71) and (73)-(76) in terms of angles.
For example, if $\psi_{hk}$ is the angle between the hth tangent vector of E and the
kth tangent vector of E', the second formulas in (73b) and (75b) both
become, by application of Theorem 2,

(79)    $\cos\psi_{hk} = \sum_i \cos\omega_{hi}\sec\theta_i\cos\phi_{ki} = \sum_i \cos\omega'_{ki}\sec\theta'_i\cos\phi'_{hi}.$
Similarly, if $\chi_{hk}$ is the angle between the hth conjugate vector of E and the
kth conjugate vector of E', we obtain, from the second formula in either (74b)
or (76b),

    $\cos\chi_{hk} = \sum_i \cos\Omega_{hi}\sec\theta_i\cos\phi'_{ik} = \sum_i \cos\Omega'_{ki}\sec\theta'_i\cos\phi_{ih}.$

Here, $\omega_{ij}$ and $\Omega_{ij}$ are the angles defined in §2 for E, and $\omega'_{ij}$ and $\Omega'_{ij}$ are the
corresponding angles for E'.

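Transformation of Christoffel symbols and the B's. From equations (17)
and similar equations for the ennuple E', we readily obtain the equations of
the transformation,

(80)    $C'_{ij}{}^{k} = C_{pq}^{r}\,\frac{\partial s^{p}}{\partial s^{\prime i}}\frac{\partial s^{q}}{\partial s^{\prime j}}\frac{\partial s^{\prime k}}{\partial s^{r}} + \frac{\partial}{\partial s^{\prime j}}\!\left(\frac{\partial s^{r}}{\partial s^{\prime i}}\right)\frac{\partial s^{\prime k}}{\partial s^{r}},$

of the Christoffel symbols $C_{ij}^{k}$ for the ennuple E into the Christoffel symbols
$C'_{ij}{}^{k}$ for the ennuple E'.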

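From these equations follow directly those of the transformation of the
symbols $B_{ij}^{k}$ for E into the symbols $B'_{ij}{}^{k}$ for E', namely

(81)    $B'_{ij}{}^{k} = B_{pq}^{r}\,\frac{\partial s^{p}}{\partial s^{\prime i}}\frac{\partial s^{q}}{\partial s^{\prime j}}\frac{\partial s^{\prime k}}{\partial s^{r}} + \left(\frac{\partial}{\partial s^{\prime j}}\frac{\partial s^{r}}{\partial s^{\prime i}} - \frac{\partial}{\partial s^{\prime i}}\frac{\partial s^{r}}{\partial s^{\prime j}}\right)\frac{\partial s^{\prime k}}{\partial s^{r}}.$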


If we apply the result of differentiating the first of equations (68) to the 
last term in (81), we find that (81) may be rewritten in the form 


r 
Pq 


Os? Os? Os? Os? ‘Os? 


( ds'* ds'* —) Os? 


Os’* 


and hence, by virtue of (22) and the relations $\partial s^{\prime k}/\partial s^{i}=\alpha^{k}|_{i}$ and
$\alpha^{\prime k}|_{i}=\delta^{k}_{i}$, in the form




Os? Os? 


a! | 5 = (a*| 5.0 asi 


where the components of the covariant derivatives are, as indicated,
referred to E'.*

Transformation of angular and distantial spread vectors. Appealing to the
equations for E' corresponding to (28) and (29), we conclude that the
components, referred to E, of the angular spread and curvature vectors of the
congruences of E' are $\gamma_{ij}|^{r}=C'_{ij}{}^{k}(\partial s^{r}/\partial s^{\prime k})$. It follows, then, from (80), that
the equations of the transformation from the angular spread and curvature
vectors of E to those of E', expressed in terms of components referred to E,
are

(82)    $\gamma_{ij}|^{r} = c_{pq}|^{r}\,\frac{\partial s^{p}}{\partial s^{\prime i}}\frac{\partial s^{q}}{\partial s^{\prime j}} + \frac{\partial}{\partial s^{\prime j}}\frac{\partial s^{r}}{\partial s^{\prime i}}.$

The inverse transformation, when expressed in terms of the components
referred to E', that is, $\gamma'_{ij}|^{k}$ and $c'_{ij}|^{k}$, has the same form.

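Equations (82) constitute the generalizations to Riemannian geometry of the
most general form of the fundamental relation of Liouville for geodesic curvatures
on a two-dimensional surface.† But equations (82) are Christoffel's equations
in invariant form. Thus, we have a striking geometric interpretation of
Christoffel's famous formulas.

By means of (31) and the corresponding equations for E', we readily
deduce from (81) the equations of the transformation from the distantial spread
vectors of the pairs of congruences of E to those of the pairs of congruences
of E'. Written in terms of components referred to E', they are

(83)    $\beta'_{ij}|^{k} = b'_{pq}|^{k}\,\frac{\partial s^{p}}{\partial s^{\prime i}}\frac{\partial s^{q}}{\partial s^{\prime j}} + \left(\frac{\partial}{\partial s^{\prime j}}\frac{\partial s^{r}}{\partial s^{\prime i}} - \frac{\partial}{\partial s^{\prime i}}\frac{\partial s^{r}}{\partial s^{\prime j}}\right)\frac{\partial s^{\prime k}}{\partial s^{r}}.$

Since $\partial s^{r}/\partial s^{\prime i}=\alpha_i|^{r}$, we infer from these equations the following theorem.‡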


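THEOREM 33. If E is an ennuple of Tchebycheff, a necessary and sufficient
condition that E' be an ennuple of Tchebycheff is that

    $\frac{\partial \alpha_i|^{r}}{\partial s^{\prime j}} = \frac{\partial \alpha_j|^{r}}{\partial s^{\prime i}}$    (i, j, r = 1, 2, ..., n).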


* The first of these equations has a simple geometric interpretation. According to Theorem 21,
$B'_{ij}{}^{k}=0$ characterizes $ds^{\prime k}$ as an exact differential; but, by Theorem 1, the vanishing of the quantities
in the parenthesis is precisely the condition that $(\partial s^{\prime k}/\partial s^{r})\,ds^{r}=ds^{\prime k}$ be exact.

† Graustein, loc. cit., p. 570.

‡ A generalization of Theorem 20 in Graustein, loc. cit., p. 580. This theorem is for the case
n=2 and assumes that E is an orthogonal ennuple.




An obvious solution of these equations is $\alpha_i|^{r}$ constant. But $\alpha_i|^{r}$ must, in
any case, satisfy the equations $g_{rs}\,\alpha_i|^{r}\alpha_i|^{s}=1$, and these equations cannot
be satisfied by constants, in general.

Applications to a sheaf of congruences. A totality of $\infty^{n-1}$ congruences
which has the property that each two congruences cut under a constant angle
we shall call a sheaf of congruences.


THEOREM 34. If E is an ennuple from a sheaf of congruences, E' is an
ennuple from the sheaf if and only if the $n^2$ quantities $\partial s^{i}/\partial s^{\prime j}$ are constants.


Since E belongs to the sheaf, the angles $\omega_{ij}$, $\theta_i$, $\Omega_{ij}$ pertaining to E are all
constant. Evidently, the ith congruence of E' belongs to the sheaf if and only
if the angles $\psi_{ri}$ (r = 1, 2, ..., n) which it makes with the n congruences of E
are constant. But this is the case, according to (79), if and only if the angles
$\phi_{ir}$ (r = 1, 2, ..., n) are constants, and hence, by (78), if and only if $\partial s^{r}/\partial s^{\prime i}$
(r = 1, 2, ..., n) are constants. Thus, the theorem is proved.

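It follows that, if E and E' are ennuples from the same sheaf, formulas
(82) and (83) become

(84)    $\gamma_{ij}| = c_{pq}|\,\frac{\partial s^{p}}{\partial s^{\prime i}}\frac{\partial s^{q}}{\partial s^{\prime j}}, \qquad \beta_{ij}| = b_{pq}|\,\frac{\partial s^{p}}{\partial s^{\prime i}}\frac{\partial s^{q}}{\partial s^{\prime j}}.$

These relations between the curvature and spread vectors of E and those
of E' reflect the fact, evident from (80), that in a transformation from one
ennuple of a sheaf to a second the Christoffel symbols behave like the
components of tensors.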

The relations embody various results. To begin with, we note 


THEOREM 35. If one ennuple from a sheaf of congruences has constant dis- 
tantial spread vectors, and hence constant angular spread and curvature vectors, 
so has every ennuple from the sheaf. 


In particular, if one ennuple is an ennuple of Tchebycheff and therefore
Cartesian, so is every ennuple.
From (84) and (71) we conclude:

    $g^{\prime ij}\,\gamma_{ij}| = g^{pq}\,c_{pq}|.$

Hence the vector $g^{ij}c_{ij}|$ is the same for every ennuple of the sheaf and is,
then, in this sense, an invariant vector of the sheaf. In particular, if E is
orthogonal,

    $g^{ij}c_{ij}| = \sum_i c_{ii}|.$




Thus, the sum of the curvature vectors of an orthogonal ennuple of a sheaf is
the same for every orthogonal ennuple of the sheaf.*

Employing (84) and (71), we can construct other invariants of the sheaf.
For example, the tensors

    $g^{ip}g^{jq}\,c_{ij}|^{r}\,c_{pq}|^{t}, \qquad g^{ip}g^{jq}\,b_{ij}|^{r}\,b_{pq}|^{t}$

are the same for every ennuple of the sheaf.

When we multiply each of these tensors by $g_{rt}$ and sum over r, t, we
obtain two scalar invariants of the sheaf. The values of these scalars for an
orthogonal ennuple are $\sum(1/c_{ij})^2$ and $\sum(1/b_{ij})^2$, where $1/c_{ij}$ and $1/b_{ij}$ are the
lengths of the vectors $c_{ij}|$ and $b_{ij}|$. Thus, the sum of the squares of the lengths
of all the curvature and angular spread vectors,† and the sum of the squares
of the lengths of all the distantial spread vectors, of an orthogonal ennuple
of a sheaf are the same for every orthogonal ennuple of the sheaf.
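
The invariance of the sum of squares can be illustrated numerically for n = 3. The sketch below is only an illustration under assumptions of our own: the components `b[p][q][r]` of the distantial spread vectors, referred to an orthogonal ennuple E, are made-up numbers, and the constant orthogonal matrix `A`, standing for $\partial s^{p}/\partial s^{\prime i}$, is built from two arbitrarily chosen rotations. The components referred to a second orthogonal ennuple E' are formed by the tensor law, which is legitimate here because the coefficients are constants.

```python
import math

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

# A[p][i] = ds^p/ds'^i, constant and orthogonal, so that ds'^k/ds^r = A[r][k].
A = matmul(rot_z(0.7), rot_x(-0.4))

# Hypothetical components b[p][q][r] of the vectors b_pq|, antisymmetric in p, q.
b = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
b[0][1] = [0.3, -1.2, 2.0]
b[0][2] = [1.5, 0.0, -0.7]
b[1][2] = [-0.4, 0.9, 0.1]
for p in range(3):
    for q in range(p):
        b[p][q] = [-x for x in b[q][p]]

def transform(b, A):
    """Components referred to E': sum over p, q, r of b[p][q][r] A[p][i] A[q][j] A[r][k]."""
    rng = range(3)
    return [[[sum(b[p][q][r] * A[p][i] * A[q][j] * A[r][k]
                  for p in rng for q in rng for r in rng)
              for k in rng] for j in rng] for i in rng]

def sum_of_squares(t):
    """Sum of the squares of the lengths of all the vectors t[i][j] (orthogonal ennuple)."""
    return sum(x * x for plane in t for vec in plane for x in vec)

beta = transform(b, A)
print(abs(sum_of_squares(beta) - sum_of_squares(b)) < 1e-9)
```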

12. Inclusion of congruences of curves in families of surfaces. In §6 we
found the conditions under which r linearly independent congruences lie in a
family of r-dimensional surfaces. The purpose of this section is to treat the
most general problem of this type, namely that of determining the family of
surfaces of lowest dimensionality in which lie all the congruences of an
arbitrarily chosen set of congruences.

The problem is not a simple one and much preliminary work is needed. 
To begin with, we shall show, by means of the following theorem, that we may 
restrict ourselves to sets of linearly independent congruences. 


THEOREM 36. If in a set of congruences there are r, and no more than r,
linearly independent congruences and an arbitrarily chosen, but fixed, subset of r
linearly independent congruences lies in a family of k-dimensional surfaces $S_k$
and in no family of surfaces of lower dimensionality, all the congruences of the
set lie in the surfaces $S_k$ and in no family of surfaces of lower dimensionality.


To establish the theorem, it suffices to prove that, if r linearly independent
congruences lie in a family of k-dimensional surfaces $S_k$, any congruence
which is a linear combination of them lies in the surfaces $S_k$. But this
proposition is easily established.

The solution of the proposed problem is going to depend, not only on the 
distantial spread vectors of the given congruences, but also on the distantial 
spread vectors of the congruences determined by the distantial spread vectors 

* Bortolotti, Stelle di congruenze e parallelismo assoluto, Rendiconti della Accademia dei Lincei,
(6), vol. 9 (1929), pp. 530-538, gives this theorem. He approaches the subject indirectly, through the
study of a metric connection with absolute parallelism, and is concerned with invariants which are
the same simply for every orthogonal ennuple, not for every ennuple, of the sheaf.
† This result is also to be found in the paper by Bortolotti just cited.




of the given congruences. Accordingly, we shall find it convenient to intro- 
duce the following terminology. 


DEFINITION 1. The distantial spread vector of the two congruences deter- 
mined by two ordered vector-fields shall be called the distantial spread vector of 
the two vector-fields, in the given order. 


We next prove a theorem for the distantial spread vectors of a given set 
of vector-fields analogous to Theorem 36 for congruences. 


THEOREM 37. If in a set of vector-fields there are r, and no more than r,
linearly independent vector-fields, say $V_1, V_2, \ldots, V_r$, and if $V_1, V_2, \ldots, V_r$
and their distantial spread vectors are linearly dependent on k linearly
independent vector-fields $V_1, V_2, \ldots, V_r, V_{r+1}, \ldots, V_k$, then the distantial spread
vector of any two vector-fields of the given set is a linear combination of
$V_1, V_2, \ldots, V_k$.


Let the vector-fields $V_1, V_2, \ldots, V_r$ serve as the first r tangent
vector-fields of an ennuple E, let r linearly independent linear combinations of
$V_1, V_2, \ldots, V_r$ serve as the first r tangent vector-fields of a second ennuple
E', and assume that the remaining tangent vector fields of E and E' are
identical. Then


os? 


= 


Hence, by (83), 


Os? Os? 


(85) * = 


P.@ 


Without loss of generality we may assume that $V_1, V_2, \ldots, V_k$ are the first
k tangent vector-fields of E. It follows, then, by hypothesis, that $b_{pq}|^{t}=0$
for p, q = 1, 2, ..., r; t = k+1, ..., n. Hence, $\beta_{ij}|$ is a linear combination of
$V_1, V_2, \ldots, V_k$ and the theorem follows.

Suppose, now, that we have given a set, $T_0$, of linearly independent
vectors, that is, vector-fields. In solving the proposed problem for the
congruences determined by the vectors of $T_0$, we shall have to consider all the
vectors obtainable from those of $T_0$ by repeated application of the process of
finding distantial spread vectors. To systematize the repetition of this
process, we adopt the following definition.


DEFINITION 2. A distantial spread vector of $T_0$ of order k (≥ 1) shall be a
distantial spread vector of two distantial spread vectors of $T_0$ of orders lower
than k, at least one of which is of order k−1. A distantial spread vector of $T_0$ of
order zero shall be a vector of $T_0$.




The totality of distantial spread vectors of $T_0$ of order i we shall denote
by $D_i$ and the totality of those of orders 0, 1, ..., i we shall designate by $T_i$.
Evidently,

    $T_i = T_{i-1} + D_i$    (i ≥ 1).


THEOREM 38. The set of vectors $T_i$ (i ≥ 1) consists of the vectors of $T_0$ and
the distantial spread vectors of the vectors of $T_{i-1}$.


The theorem follows directly from the definition.

For our purposes, the essential aspect of the set of vectors $T_i$ is the
maximum number of linearly independent vectors contained in $T_i$. We shall call this
the dimension number of $T_i$ and denote it by $n_i$. Inasmuch as $T_i$ contains $T_{i-1}$,
it is clear that $n_i \ge n_{i-1}$, i ≥ 1.

The sequence $n_0, n_1, n_2, \ldots$ we shall refer to as the sequence of dimension
numbers of $T_0$. Corresponding to it we may choose, in various ways, a
sequence of sets of vectors

(86)    $V_1, \ldots, V_{n_0};\quad V_{n_0+1}, \ldots, V_{n_1};\quad V_{n_1+1}, \ldots, V_{n_2};\quad \ldots$

such that $V_1, \ldots, V_{n_i}$ are $n_i$ linearly independent vectors in $T_i$. From the
definition of $n_i$, it is evident that all the vectors of $T_i$ are linear combinations
of $V_1, \ldots, V_{n_i}$ and that the vectors, $V_{n_{i-1}+1}, \ldots, V_{n_i}$, of the (i+1)st set
of the sequence belong to $D_i$.

Since $n_i$ can never exceed n, the sequence (86) is finite. We can, however,
say more than this. It is true that, if a certain group in the sequence is
empty, all succeeding groups are empty. In other words:


THEOREM 39. If two consecutive numbers in the sequence of dimension
numbers of $T_0$ are equal, all subsequent ones are equal to them:

    $n_0 < n_1 < \cdots < n_{k-1} < n_k = n_{k+l} \le n$    (l = 1, 2, ...).


We are to prove that, if $n_{k+1}=n_k$, then $n_{k+2}=n_k$. Since $n_{k+1}=n_k$, the
vectors of $T_{k+1}$, as well as the vectors of $T_k$, are linearly dependent on
$V_1, \ldots, V_{n_k}$. The distantial spread vectors of $V_1, \ldots, V_{n_k}$, since they
belong, by Theorem 38, to $T_{k+1}$, are then linearly dependent on $V_1, \ldots, V_{n_k}$.
Hence, by Theorem 37, the distantial spread vectors of all the vectors of
$T_{k+1}$ are linearly dependent on $V_1, \ldots, V_{n_k}$. But these distantial spread
vectors, together with the vectors of $T_0$, are precisely the vectors of $T_{k+2}$.
Thus the vectors of $T_{k+2}$ are linear combinations of $V_1, \ldots, V_{n_k}$, and
therefore $n_{k+2}=n_k$.

By a reduced set of a given set of vectors we shall mean a set of linearly
independent vectors of the given set on which all the vectors of the given
set are linearly dependent; and by a reduced sequence of the sequence $D_0$,
$D_1, D_2, \ldots$ of successive distantial spread vectors of $T_0$ we shall mean a
sequence of sets of vectors $D_0', D_1', D_2', \ldots$ such that the vectors of $T_i'$,
where $T_i'=T_{i-1}'+D_i'$, constitute a reduced set of the set of vectors $T_i$.

The sequence of groups of vectors (86) is a reduced sequence of the
sequence $D_0, D_1, D_2, \ldots$. But this sequence is perhaps lacking in an important
property of the sequence $D_0, D_1, D_2, \ldots$, namely, the property that every
vector in the set $D_i$ is a distantial spread vector of two vectors belonging to
preceding sets. We shall call this the property of cohesion.

It is conceivable that a sequence of successive distantial spread vectors
of $T_0$ cannot be rendered both reduced and cohesive. That this is not the
case we shall prove by actually defining a sequence $D_0', D_1', D_2', \ldots$ which
has both properties.


DEFINITION 3. The vectors of $D_0'$ shall be identical with those of $D_0$ (or $T_0$).
The vectors of $D_i'$ (i ≥ 1) shall be chosen from the distantial spread vectors of the
vectors of $T_{i-1}'$ so that the vectors of $T_i'$ constitute a reduced set for the vectors of $T_0$
and the distantial spread vectors of the vectors of $T_{i-1}'$.


It is evident that the sequence $D_0', D_1', D_2', \ldots$ thus defined is cohesive.
To show that it is a reduced sequence of $D_0, D_1, D_2, \ldots$, we must prove
(a) that $D_i'$ is a subset of $D_i$, and (b) that the dimension number of $T_i'$ is
the same as that of $T_i$.

A. $D_i'$ is a subset of $D_i$. It is obvious from the definition that the theorem
is true for i = 0, 1. Suppose that i ≥ 2. It follows from the definition that the
distantial spread vectors of the vectors of $T_{i-1}'$, from which the vectors of $D_i'$
are chosen, are distantial spread vectors of $T_0$ of orders not greater than i.
Hence, if a vector of $D_i'$ does not belong to $D_i$, it is a distantial spread vector
of $T_0$ of order less than i, and so is included among the distantial spread
vectors of vectors of $T_{i-2}'$. But these are, by definition, linear combinations of the
vectors of $T_{i-1}'$. Thus, the vector of $D_i'$ in question is linearly dependent on
the vectors of $T_{i-1}'$, and this contradicts the demand that the vectors of
$T_i'=T_{i-1}'+D_i'$ be linearly independent.

Since $D_i'$ is a subset of $D_i$, $T_i'$ is a subset of $T_i$, i = 0, 1, 2, .... According
to the definition, the vectors of $T_i'$ are linearly independent. It remains then
to prove that the number of them, $n_i'$, is equal to the maximum number, $n_i$,
of linearly independent vectors in $T_i$.

B. $n_i'=n_i$. Inasmuch as $n_0'=n_0$, it suffices to show that, if $n_i'=n_i$, then
$n_{i+1}'=n_{i+1}$. Since $n_i'=n_i$, the vectors of $T_i$ are linear combinations of the
vectors of $T_i'$. But the distantial spread vectors of the vectors of $T_i'$ are
linearly dependent on the vectors of $T_{i+1}'=T_i'+D_{i+1}'$. Hence, by Theorem 37,
the distantial spread vectors of the vectors of $T_i$ are linearly dependent on
the vectors of $T_{i+1}'$. But this means, according to Theorem 38, that the
vectors of $T_{i+1}$ are linear combinations of the vectors of $T_{i+1}'$, and hence
$n_{i+1}'=n_{i+1}$.

The result thus established may be stated as follows. 

THEOREM 40. Any sequence of groups of vectors formed as described in
Definition 3 is a cohesive, reduced sequence of the sequence $D_0, D_1, D_2, \ldots$ of the
successive distantial spread vectors of $T_0$.

We return now to the proposed problem, restricting ourselves, as is per- 
mitted by Theorem 36, to a set of linearly independent congruences. 

THEOREM 41. If m is the largest number in the sequence of dimension
numbers of the set, $T_0$, of vector-fields tangent to r linearly independent congruences,
the r congruences lie in a family of m-dimensional surfaces and in no family
of surfaces of lower dimensionality.


By hypothesis,

(87)    $n_0 < n_1 < \cdots < n_p = n_{p+l} = m$    (l = 1, 2, ...).

Without loss of generality, we may take the given congruences as the
first $n_0$ congruences of an ennuple E. The congruences lie, then, in the family
of hypersurfaces $\varphi$ = const. if and only if

(88)    $\frac{\partial\varphi}{\partial s^{i}} = 0$    (i = 1, 2, ..., $n_0$),

and this system of equations is compatible if and only if

(89)    $\sum_{k=n_0+1}^{n} B_{ij}^{k}\,\frac{\partial\varphi}{\partial s^{k}} = 0$    (i, j = 1, 2, ..., $n_0$).

The vectors $b_{ij}|^{k}=B_{ij}^{k}$, i, j = 1, 2, ..., $n_0$, are the distantial spread
vectors of the vectors of $T_0$, or $T_0'$. From them we choose, according to the
prescriptions of Definition 3, the vectors of $D_1'$. In $T_1'=T_0'+D_1'$ we have then
$n_1$ linearly independent vectors, on which all the distantial spread vectors
of $T_0'$ are linearly dependent.

The vectors of $T_0'$ are the first $n_0$ tangent vectors of E and those of $D_1'$
may be thought of as the next $n_1-n_0$ tangent vectors of E. Equations (89)
are, then, equivalent to the equations $\partial\varphi/\partial s^{k}=0$, k = $n_0$+1, ..., $n_1$, and,
when adjoined to equations (88), give rise to the extended system of
equations

(90)    $\frac{\partial\varphi}{\partial s^{i}} = 0$    (i = 1, 2, ..., $n_1$).




The procedure now repeats itself. The conditions of compatibility of (90),
namely,

(91)    $\sum_{k=n_1+1}^{n} B_{ij}^{k}\,\frac{\partial\varphi}{\partial s^{k}} = 0$    (i, j = 1, 2, ..., $n_1$),

involve the distantial spread vectors of the vectors of $T_1'$. From these we
choose, following Definition 3, the vectors of $D_2'$, thus obtaining the $n_2$
linearly independent vectors, $T_2'=T_1'+D_2'$, on which the vectors involved in
(91) are linearly dependent. Assuming that the vectors of $D_2'$ are the "next"
$n_2-n_1$ tangent vectors of E, we find, then, that equations (90) and (91) may
be replaced by the equivalent system $\partial\varphi/\partial s^{i}=0$, i = 1, 2, ..., $n_2$.

After this procedure has been carried out p times, we obtain the system
of equations

(92)    $\frac{\partial\varphi}{\partial s^{i}} = 0$    (i = 1, 2, ..., $n_p$),

as necessary and sufficient condition that the family of hypersurfaces
$\varphi$ = const. contain the r given congruences. Since, by (87), $n_p=m$, this system
of equations is completely integrable. For the distantial spread vectors of the
vectors of $T_p'$ are, by hypothesis, linear combinations of the vectors of $T_p'$,
and since these are the first $n_p$ vectors of E, $B_{ij}^{k}=0$ for i, j = 1, 2, ..., $n_p$,
k = $n_p$+1, ..., n, so that the conditions of integrability are identically
satisfied.

It follows, now, that the r given congruences lie in the family of
m-dimensional surfaces defined by n−m functionally independent solutions of the
system of equations (92), and in no family of surfaces of lower dimensionality.

The fact that equations (92) are completely integrable means that the
first $n_p$ (= m) congruences of E lie in a family of m-dimensional surfaces. The
tangent vectors of these congruences are precisely the vectors of $T_p'$ and,
by hypothesis, $T_p'$ is a reduced set of vectors for the set consisting of $T_0$
and the distantial spread vectors of $T_0$ of all orders. Hence:


THEOREM 42. A necessary and sufficient condition that r linearly independ- 
ent congruences lie in a family of m-dimensional surfaces and in no family of 
surfaces of lower dimensionality is that m be the minimum number of linearly 
independent vectors on which the tangent vectors of the congruences and their 
distantial spread vectors of all orders are linearly dependent. The curves of m 
congruences whose tangent vectors constitute such a set of linearly independent 
vectors lie, then, in a family of m-dimensional surfaces, and it is this family of 
surfaces in which the given congruences are contained. 
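
The construction of Definition 3 and the count required by Theorems 41 and 42 can be carried out mechanically. The sketch below is an illustration under assumptions of our own, not the procedure of the paper: the distantial spread vector of two fields is replaced by their Lie bracket (for fields not of unit length the two should differ only by linear combinations of the fields themselves, which does not affect the count), the fields have polynomial components, and linear independence is tested at a single sample point supposed to be generic. All identifiers are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# A polynomial is a dict {exponent tuple: coefficient}; a vector field is a list of polynomials.
def p_add(p, q, sign=1):
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) + sign * c
        if out[e] == 0:
            del out[e]
    return out

def p_mul(p, q):
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            out = p_add(out, {tuple(i + j for i, j in zip(e1, e2)): c1 * c2})
    return out

def p_diff(p, j):
    return {e[:j] + (e[j] - 1,) + e[j + 1:]: c * e[j] for e, c in p.items() if e[j] > 0}

def p_eval(p, point):
    total = Fraction(0)
    for e, c in p.items():
        term = Fraction(c)
        for x, k in zip(point, e):
            term *= Fraction(x) ** k
        total += term
    return total

def bracket(X, Y):
    """Lie bracket: [X, Y]^i = X^j dY^i/dx^j - Y^j dX^i/dx^j."""
    n, out = len(X), []
    for i in range(n):
        comp = {}
        for j in range(n):
            comp = p_add(comp, p_mul(X[j], p_diff(Y[i], j)))
            comp = p_add(comp, p_mul(Y[j], p_diff(X[i], j)), sign=-1)
        out.append(comp)
    return out

def rank(rows):
    """Rank of a non-empty list of equally long rows of Fractions (Gaussian elimination)."""
    rows, r = [list(row) for row in rows], 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][col] / rows[r][col]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def dimension_numbers(fields, point):
    """Return [n_0, n_1, ...] up to and including the first repeated value."""
    def at(X):
        return [p_eval(c, point) for c in X]

    def extend(basis, candidates):
        basis = list(basis)
        for Z in candidates:   # keep only candidates independent of those already chosen
            if rank([at(W) for W in basis] + [at(Z)]) > len(basis):
                basis.append(Z)
        return basis

    basis = extend([], fields)
    dims = [len(basis)]
    while True:
        new = extend(basis, [bracket(X, Y) for X, Y in combinations(basis, 2)])
        dims.append(len(new))
        if len(new) == len(basis):
            return dims
        basis = new

def const(c, n):
    return {(0,) * n: c}

def var(j, n):
    return {tuple(1 if i == j else 0 for i in range(n)): 1}

# Three hypothetical sets T_0: d/dx, d/dy + x d/dz; d/dx, d/dy; and a four-dimensional case.
heis = [[const(1, 3), {}, {}], [{}, const(1, 3), var(0, 3)]]
flat = [[const(1, 3), {}, {}], [{}, const(1, 3), {}]]
engel = [[const(1, 4), {}, {}, {}], [{}, const(1, 4), var(0, 4), {(2, 0, 0, 0): 1}]]

print(dimension_numbers(heis, (1, 2, 3)))
print(dimension_numbers(flat, (1, 2, 3)))
print(dimension_numbers(engel, (1, 2, 3, 4)))
```

The last entry of each list is the number m of Theorem 42. For the second pair of fields the sequence is stationary from the outset, and the two congruences lie in the planes z = const.; for the other two sets no family of surfaces of dimension lower than that of the space contains the congruences.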




A pplication to nonholonomic spaces.* Let there be given in V, a system 
of n—r (r>1) linearly independent total differential equations 


(93) Ai,dxi = 0 


where A‘; are functions of x!, x”, - - - , x". 

If the system has no integral whatsoever, it is said to represent in $V_n$ a
single r-dimensional nonholonomic space $V_n^r$.

In case the system has precisely $n-m$ ($m \ge r$) independent integrals, we
shall say that it represents a nonholonomic manifold $V_n^r$ which lies in a family
of m-dimensional (metric) spaces. Actually, the system represents in this case
$\infty^{n-m}$ single nonholonomic spaces $V_m^r$, one in each of the m-dimensional
(metric) spaces determined by the $n-m$ integrals. In particular, if the sys-
tem is completely integrable ($m=r$), the $\infty^{n-r}$ nonholonomic spaces are the
$\infty^{n-r}$ (metric) spaces themselves.
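
The distinction between a system with no integral and a completely integrable one can be illustrated in the simplest case $n=3$, $r=2$, where (93) is a single equation $A_j\,dx^j = 0$. The sketch below uses the classical Frobenius criterion (the equation has an integral if and only if $A \cdot \operatorname{curl} A$ vanishes identically), which is a standard substitute for, not a transcription of, the conditions developed in this paper. The function names and the two sample fields are illustrative assumptions.

```python
def curl(A, p, h=1e-3):
    """Central-difference curl of the vector field A at the point p."""
    def d(i, j):
        # partial derivative of the component A_i with respect to x_j
        q_plus, q_minus = list(p), list(p)
        q_plus[j] += h
        q_minus[j] -= h
        return (A(*q_plus)[i] - A(*q_minus)[i]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))


def frobenius_defect(A, p):
    """Value of A . curl A at p; identically zero iff A_j dx^j = 0 is integrable."""
    c = curl(A, p)
    a = A(*p)
    return sum(ai * ci for ai, ci in zip(a, c))


# dz - y dx = 0: no integral, hence a nonholonomic 2-dimensional space.
contact = lambda x, y, z: (-y, 0.0, 1.0)
# d(xyz) = 0: completely integrable, the surfaces are xyz = const.
exact = lambda x, y, z: (y * z, x * z, x * y)

point = (0.3, 0.7, -0.2)
print(frobenius_defect(contact, point), frobenius_defect(exact, point))
```

A nonzero defect at a single point already rules out an integral; a zero defect must hold at every point before integrability can be concluded.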

We proceed to show how the discussion of the nonholonomic manifold 
(93) may be brought within the scope of our general theory and to deduce 
geometric conditions that the manifold lie in a family of m-dimensional (met- 
ric) spaces. 

We think of the coefficients $A^i{}_j$ in (93) as defining $n-r$ linearly inde-
pendent covariant vector-fields $A^i{}_j$, $i=r+1, \dots, n$, and select r other co-
variant vector-fields $A^i{}_j$, $i=1, 2, \dots, r$, so that the n fields $A^i{}_j$, $i=1,
2, \dots, n$, are linearly independent. We then set $\lambda^i{}_j=\rho^i A^i{}_j$, choosing the n
functions $\rho^i$ so that the n fields of contravariant vectors $\lambda_i{}^j$ which are con-
jugate to the fields of covariant vectors $\lambda^i{}_j$ consist of unit vectors. Thereby,
we obtain in $V_n$ an ennuple of congruences, $E$, with reference to which the
equations (93) of the nonholonomic manifold take the form


(94) dst = a‘ = 0 (j=r+1,---,n). 


It follows that the congruences of curves which lie in the nonholonomic
manifold are precisely the congruences which are linearly dependent on the
first r congruences of the ennuple $E$. Furthermore, $\phi$ is an integral of (94) if
and only if $\partial\phi/\partial s^i=0$, $i=1, 2, \dots, r$, that is, if and only if the family of
hypersurfaces $\phi = \text{const.}$ contains the first r congruences of $E$. Hence, the con-
ditions of Theorem 42, applied to these r congruences, are precisely the con-
ditions under which the nonholonomic manifold lies in a family of m-dimen-
sional (metric) spaces.


* For an extended treatment of nonholonomic spaces, see Vranceanu, Studio geometrico dei 
sistemi anolonomi, Annali di Matematica, (4), vol. 6 (1928), pp. 9-43. 


HARVARD UNIVERSITY,
CAMBRIDGE, MASS.




SOME POINTS IN THE THEORY OF TRIGONOMETRIC 
AND POWER SERIES* 


BY 
ANTONI ZYGMUND 


I. ON THE CHARACTER OF OSCILLATION OF THE PARTIAL SUMS 
OF FOURIER SERIES 


1. The fundamental theorem. Completing a well known result of Kolmog- 
oroff [12], Marcinkiewicz [15] has recently constructed a function integrable 
L, whose Fourier series possesses partial sums oscillating finitely almost every- 
where. It is, therefore, natural to ask what may be said about the relative 
position of the interval of oscillation of s,(x) and the value f(x), beyond the 
well known fact that the said interval contains f(x). The result proved in this 
note is a first attempt in this direction. 


THEOREM. If the partial sums $s_n(x)$ of the Fourier series of a function $f(x)$
integrable L,


(1) $f(x) \sim \tfrac{1}{2} a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx),$

satisfy an inequality

(2) $s_n(x) \ge -\phi(x) \qquad (0 \le x \le 2\pi;\ n = 0, 1, \dots),$


with $\phi(x) \ge 0$ integrable L, then, for almost every x,


(3) $f(x) = \tfrac{1}{2}\left[\limsup_{n\to\infty} s_n(x) + \liminf_{n\to\infty} s_n(x)\right].$


2. Statement of lemmas. The proof of our theorem is based on three
lemmas which will be stated in this section and the proof of which will be
given in the next section. Let $\sigma_n(x)$, $\bar\sigma_n(x)$, $\bar s_n(x)$ denote, respectively, the
first arithmetic means of the series (1), of the conjugate series


$\sum_{n=1}^{\infty} (a_n \sin nx - b_n \cos nx),$


and the partial sums of the latter series. We have [20, 24, 18, 30]


* Presented to the Society, March 31, 1934; received by the editors November 5, 1933. The 
six notes constituting this paper are independent of one another, although they treat related topics. 
The numbers in square brackets refer to the Bibliography at the end of the paper. 


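(4) $s_n(x) - \sigma_n(x) = \dfrac{\bar s_n'(x)}{n+1}, \qquad \dfrac{\bar s_n'(x)}{n+1} = \dfrac{2}{\pi}\int_{-\pi}^{\pi} s_n(u) \cos (n+1)(u-x)\, K_n(u-x)\,du,$


where


$K_n(u) = \dfrac{1}{2(n+1)} \left( \dfrac{\sin \tfrac12 (n+1)u}{\sin \tfrac12 u} \right)^2$


is the well known Fejér's kernel.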
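
The first relation in (4), that $s_n - \sigma_n$ equals the derivative of the conjugate partial sum divided by $n+1$, is a purely algebraic identity between coefficients and can be checked numerically. The following is a minimal sketch with arbitrarily chosen coefficients; the helper names are illustrative and do not come from the paper.

```python
import math


def partial_sum(a, b, n, x):
    """s_n(x) for the series a_0/2 + sum (a_k cos kx + b_k sin kx)."""
    return a[0] / 2 + sum(
        a[k] * math.cos(k * x) + b[k] * math.sin(k * x) for k in range(1, n + 1)
    )


def fejer_mean(a, b, n, x):
    """sigma_n(x): arithmetic mean of s_0, ..., s_n."""
    return sum(partial_sum(a, b, m, x) for m in range(n + 1)) / (n + 1)


def conjugate_sum_derivative(a, b, n, x):
    """d/dx of the conjugate partial sum  sum (a_k sin kx - b_k cos kx)."""
    return sum(
        k * (a[k] * math.cos(k * x) + b[k] * math.sin(k * x)) for k in range(1, n + 1)
    )


# Arbitrary test coefficients (b[0] is unused).
a = [0.7, -1.2, 0.4, 0.9, -0.3, 0.25]
b = [0.0, 0.5, -0.8, 0.1, 0.6, -0.45]
n, x = 5, 1.234
lhs = partial_sum(a, b, n, x) - fejer_mean(a, b, n, x)
rhs = conjugate_sum_derivative(a, b, n, x) / (n + 1)
print(lhs, rhs)
```

Both sides reduce to $\sum_{k=1}^{n} \frac{k}{n+1}(a_k \cos kx + b_k \sin kx)$, which is why the two printed numbers agree up to rounding.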


Lemma 1. Let H be a measurable set contained in $(-\pi, \pi)$ and having $x=0$
as a point of zero density. Then the function


(5) $L_n(t) = \int_H K_n(u) K_n(u-t)\,du$
satisfies the relations*
(6) Lal) 
(7) $L_n(t) = o(n)$ (uniformly in $t$).

Lemma 2. Under the hypothesis of the theorem we have
(8) $|s_n(x) - \sigma_n(x)| = \dfrac{|\bar s_n'(x)|}{n+1} \le \tau_n(x),$

where $\tau_n(x)$ are the Fejér means of an integrable function $\psi(x) \ge 0$.

Lemma 3. If, under the hypothesis of the theorem, we have for every x be-
longing to a set E of positive measure
(9) $s_n(x) - \sigma_n(x) = \dfrac{\bar s_n'(x)}{n+1} \ge -M \qquad (n \ge n_0,\ M > 0),$
then, for almost every x in E,


(10) $\limsup_{n\to\infty}\, [s_n(x) - \sigma_n(x)] \le M.$

3. Proof of lemmas. To prove Lemma 1, let $Q_n(u) = n/(1+n^2u^2)$. Since
the kernel $K_n(u)$ is $O(n)$ for $0 \le u \le 1/n$, and $O(1/(nu^2))$ for $1/n \le u \le 3\pi/2$,
it is easy to see that $K_n(u) \le CQ_n(u)$ (cf. Fejér [3]). From (5) we deduce that
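
The domination of the Fejér kernel by a constant multiple of $Q_n$ can be observed numerically. The sketch below assumes the normalization $K_n(u) = \frac{1}{2(n+1)}\bigl(\sin\frac12(n+1)u / \sin\frac12 u\bigr)^2$ and samples the ratio $K_n/Q_n$ on $[0, 3\pi/2]$; the sampling grid and the function names are illustrative choices.

```python
import math


def fejer_kernel(n, u):
    """K_n(u) with the normalization K_n(0) = (n + 1) / 2."""
    s = math.sin(u / 2)
    if abs(s) < 1e-12:
        return (n + 1) / 2
    return (math.sin((n + 1) * u / 2) / s) ** 2 / (2 * (n + 1))


def Q(n, u):
    """The majorant Q_n(u) = n / (1 + n^2 u^2)."""
    return n / (1 + n * n * u * u)


def max_ratio(n, samples=4000):
    """Largest sampled value of K_n(u) / Q_n(u) for 0 <= u <= 3*pi/2."""
    top = 1.5 * math.pi
    return max(
        fejer_kernel(n, top * j / samples) / Q(n, top * j / samples)
        for j in range(samples + 1)
    )


for n in (1, 5, 50, 500):
    print(n, max_ratio(n))
```

The sampled ratios stay below a fixed bound as n grows, which is the content of the inequality $K_n(u) \le CQ_n(u)$.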


* In the following we use C as a generic notation for an absolute constant, not necessarily the 
same in all formulas where it occurs. 


$L_n(t) \le \int_{-\pi}^{\pi} K_n(u) K_n(u-t)\,du.$


Since the integral is an even function of $t$, we may restrict ourselves to the
values $0<t<\pi$. Break up the integral into four, extended over the intervals
$(-\pi, -t/2)$, $(-t/2, 0)$, $(0, t/2)$, $(t/2, \pi)$, and denoted respectively by
$U_n^{(1)}(t)$, $U_n^{(2)}(t)$, $U_n^{(3)}(t)$, $U_n^{(4)}(t)$. Since $Q_n(u)$ is decreasing in the interval
$0<u<3\pi/2$, it is readily seen that


U () = o(n-) f K,(u — t)du = O(n-") S CO, (2), 


0 


UX (t) CQ,(4) K,(u)du CQ,(2), 


—1/2 


t/2 
(t) < CO,(t/2) f K,(u)du < 
0 


t/2 
Adding these inequalities together we obtain (6). 

To obtain (7), it is sufficient to replace in the integral (5) the function
$K_n(u-t)$ by its upper bound (which is $O(n)$) and to notice that the remaining
integral represents the Fejér means, at $x=0$, of the characteristic function of
the set H, and so, by Lebesgue's well known criterion, tends to 0 with $1/n$.

To prove Lemma 2, replace, in the right-hand side of (4), $s_n(u)$ by
$s_n(u) + \phi(u) - \phi(u)$. Then in view of (2), the first term in (8) does not exceed


$\dfrac{2}{\pi}\int_{-\pi}^{\pi} [s_n(u) + \phi(u)] K_n(u-x)\,du + \dfrac{2}{\pi}\int_{-\pi}^{\pi} \phi(u) K_n(u-x)\,du = \dfrac{1}{\pi}\int_{-\pi}^{\pi} \psi(u) K_n(u-x)\,du,$

where $\psi(u) = 2f(u) + 4\phi(u)$.
We pass on to the proof of Lemma 3. First, we have, for almost every x
in the interval $(-\pi, \pi)$, the relation


2 
-—f [sn(u) —on(u)+M] cos 


For we may replace $s_n(u)$ by $s_n(u)+M$ under the sign of integral in (4), with-
out changing its value. If we replace there $s_n$ by $\sigma_n$, we obtain $\bar\sigma_n'(x)/(n+1)$,
which represents the difference, multiplied by $(n+2)/(n+1)$, of the first and
second arithmetic means of the series (1). This follows from the formula




(2) ve(t—v+i1) a(x) 


where c, is the general term of series (1) and 


k 


This difference obviously tends to zero for almost every x.
Let now $0<r<1$ be fixed and let


PAu) 


p 2 
n+1 
Gn (x) 


= — on(u)|K,(u — x)du, 


we have 


(12) f — on(u)|K,(u — x)du—0 


“3 
& 


for almost every x, and 


Sn (x) 
or 


[s,(u) — o,(u)+M | 
x [—3+r cos (n+1)(u—x)]K,(u—x)du+o(1) 


Break up the last integral into two, extended over E and its complement H,
and denote the corresponding expressions by $J_1$ and $J_2$. $P_n(u)$ and $K_n(u)$ are
non-negative, hence by (9), $J_1 \le 0$ and it remains to show that $J_2 \to 0$ almost
everywhere in E. It is sufficient to show that $J_2 \to 0$ at every point x where E
has density 1 and where the integral of $\psi$ (see Lemma 2) has a finite deriva-
tive. Suppose for simplicity that $x=0$ is such a point and let $M_n = \max P_n(u)$,
$0 \le u \le 2\pi$. Then (see Lemma 2)


2 2 
(13) | < =a, f [rn(u) + M|K,(u)du = <u, f Tn(u)K,(u)du + o(1). 
H H 




Expressing $\tau_n(u)$ as a Fejér's integral and interchanging the order of integra-
tion, we see that the integral of the last member of (13) is equal to


<M, (t)dt = J ) 


where $L_n(t)$ is given by formula (5). Let $\beta$ be a fixed positive number. We
have


Bin 
14 L,(é)dt = 
(14) J +f" 
Let 


V(t) = V(t) yt (t = 0), 
0 


$\gamma$ being a constant (the inequality is implied by the fact of existence of a
finite derivative of $\Psi(t)$ at $t=0$). By (7) the first integral on the right in (14)
is $o(n)\Psi(\beta/n) = o(1)$. The second integral is less than


Cc V(r) 


B/n B/n 


2Cy (* 
S o(1) + — t*dt S « (n = no(e)), 

Bin 
where $\epsilon>0$ is arbitrarily small, if only $\beta$ is sufficiently large. An analogous dis-
cussion is applied to the integral $\int_{-\pi}^{0} \psi(t) L_n(t)\,dt$. It follows that $J_2 \to 0$ for
every $0<r<1$. Since we may take r as near to 1 as we please, the truth of the
lemma follows.
4. Proof of the theorem. Let now F and G denote the sets of points at
which, respectively,

$\limsup_{n\to\infty}\, [s_n(x) - f(x)] > \limsup_{n\to\infty}\, [f(x) - s_n(x)],$

(15) $\limsup_{n\to\infty}\, [s_n(x) - f(x)] < \limsup_{n\to\infty}\, [f(x) - s_n(x)].$


To prove that the set F is of measure 0, it is enough to show that the set $F_1$
of points for which


$\limsup_{n\to\infty}\, [s_n(x) - \sigma_n(x)] > \limsup_{n\to\infty}\, [\sigma_n(x) - s_n(x)]$
is of measure 0. If it were not, we could find two numbers $N>M>0$ and a
set $F_2 \subset F_1$, meas $F_2>0$, such that


(16) $\limsup_{n\to\infty}\, [s_n(x) - \sigma_n(x)] > N > M > \limsup_{n\to\infty}\, [\sigma_n(x) - s_n(x)].$




From the last inequality we conclude the existence of an integer $n_0$ and of a
set $E \subset F_2$, meas $E>0$, such that


$\sigma_n(x) - s_n(x) \le M, \qquad n > n_0, \quad x \in E,$
and hence, by Lemma 3, we have


$\limsup_{n\to\infty}\, [s_n(x) - \sigma_n(x)] \le M < N$


almost everywhere in E, contrary to the first of inequalities (16). The theorem
is, therefore, established.

5. Additional remarks. (i) Under the hypothesis of our theorem we may
prove also that the relation


(17) $\bar f(x) = \tfrac{1}{2}\left[\limsup_{n\to\infty} \bar s_n(x) + \liminf_{n\to\infty} \bar s_n(x)\right],$
where $\bar f(x)$ is the function conjugate to $f(x)$, holds almost everywhere in
$(-\pi, \pi)$. The proof is exactly the same as before, except that, instead of (4),
we use the formula


$\bar\sigma_n(x) - \bar s_n(x) = \dfrac{s_n'(x)}{n+1} = \dfrac{2}{\pi}\int_{-\pi}^{\pi} s_n(u) \sin (n+1)(u-x)\, K_n(u-x)\,du.$


(ii) It is not difficult to see that the results above may be localized; if
we suppose that (2) is satisfied in an interval $(a, b)$, the relations (3) and (17)
are true almost everywhere in $(a, b)$. This follows from general localization
theorems for trigonometric series.

(iii) The hypothesis that the trigonometric series considered in the the-
orem is a Fourier-Lebesgue series is superfluous and may be omitted. In fact,
inequality (2) implies that the sequence $\{\int_{-\pi}^{\pi} |s_n(x)|\,dx\}$ is bounded, and so
the series (1) is a Fourier-Stieltjes series. The arguments which we have used
in the proof may be, without any difficulty, adapted to this new, slightly
more general, case. (See for instance [30].)


II. ON THE ABSOLUTE CONVERGENCE OF FOURIER SERIES 


1. It has been proved that if $f(x)$, $0 \le x \le 2\pi$, is a periodic function of
bounded variation, satisfying a Lipschitz condition of positive order, the
Fourier series of $f(x)$ converges absolutely [28, 8].

In the same way it is possible to prove the following, more precise, theo-
rem [26].*


* For a similar problem see also O. Szdsz [23]. 




THEOREM 1. If $f(x)$ is of bounded variation and satisfies a Lipschitz condi-
tion of order $\alpha$, and if


(1) $f(x) \sim \tfrac12 a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx),$


then the series


(2) $\sum_{n=1}^{\infty} \rho_n^k, \qquad \rho_n^2 = a_n^2 + b_n^2,$


converges for every $k>2/(2+\alpha)$.


The main purpose of this note is to show that the condition imposed on
k is the best possible. More precisely, we may state


THEOREM 2. For every value $0<\alpha<1$ there exists a function of bounded
variation, satisfying a Lipschitz condition of order $\alpha$, and such that the series
(2) diverges when $k=2/(2+\alpha)$.

2. For the sake of completeness we begin by proving Theorem 1. 

Let N be a positive integer and j=1, 2, - - - , 2N. By Parseval’s identity, 


Let V denote the absolute variation of $f(x)$ in the interval $(0, 2\pi)$ and $\omega(\delta)$
the modulus of continuity of $f(x)$, i.e., $\omega(\delta) = \max |f(x_1) - f(x_2)|$ for
$|x_1 - x_2| \le \delta$. In our case $\omega(\delta) = O(\delta^{\alpha})$.

For every x we have the inequality 


Integrating this over the range (0, 27), and taking into account (3), we get 
successively 
2”-1 


bed nr nr 
2N sin? = O(N-*), sin? — = N = 2°, 
> p > p ) 


n=l n=2?—1 


Pa? = , 


n=?—1 


We may suppose that $k<2$, the convergence of $\sum \rho_n^2$ being obvious. Then,
by Hölder's inequality,


2-1 
by = | 


It follows that the series
$\sum_{n=1}^{\infty} \rho_n^k = \sum_{\nu=1}^{\infty} \sum_{n=2^{\nu-1}}^{2^{\nu}-1} \rho_n^k$
converges, if only $k>2/(\alpha+2)$.

3. To prove Theorem 2 we shall consider power series of the form
$\sum b_n \exp (2\pi i n^{\alpha}) z^n$, where $0<\alpha<1$ and $b_n$ are real and very regularly tend
to zero. We shall study these series by means of the following lemmas due
to van der Corput.*


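Lemma 1. Let $a(u)$ be a real function of u, $|a'(u)| \le 1-\delta$. Then
$\left| \sum_{a<\nu\le b} \exp 2\pi i a(\nu) - \int_a^b \exp 2\pi i a(u)\,du \right| \le A_\delta,$
where $A_\delta$ depends only on $\delta$.
Lemma 2. Let $a'(u)$ be positive and decreasing. Then†
$\left| \int_a^b \exp 2\pi i a(u)\,du \right| \le \dfrac{C}{a'(b)}.$
Lemma 3. Let $a''(u) \le -\rho < 0$. Then


$\left| \int_a^b \exp 2\pi i a(u)\,du \right| \le C\rho^{-1/2}.$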

The following two propositions are (using Abel's transformation) im-
mediate corollaries of Lemmas 1 and 2.


Lemma 4. If (i) $a(u) \to \infty$, (ii) $a'(u)$ decreases monotonically to zero, (iii)
$b_n \to 0$, (iv) $\sum |\Delta b_n| < \infty$, then the series


(5) $\sum b_n \exp 2\pi i [a(n) + n\theta]$
converges uniformly on every arc $\delta \le \theta \le 1-\delta$.

* We take these lemmas in the form stated by Hille [10, 11]. In [10] several bibliographical refer-
ences are given.
† See footnote on p. 587.




Lemma 5. If (iii) and (iv) in Lemma 4 are replaced respectively by (iii')
$b_n/a'(n) \to 0$, (iv') $\sum |\Delta b_n|/a'(n) < \infty$, then the series (5) converges for every
value of $\theta$ although not necessarily uniformly (in fact it converges uniformly over
every interval $0 \le \theta \le 1-\delta$).

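4. We shall now prove

THEOREM 3. If $0<\alpha<1$, $\beta>0$, the function


(6) $f^{\alpha,\beta}(\theta) = \sum_{n=1}^{\infty} n^{-\beta} \exp 2\pi i (n^{\alpha} + n\theta),$

which is continuous in every interval $\delta \le \theta \le 1-\delta$, satisfies the following in-
equalities:

$f^{\alpha,\beta}(\theta) = O(|\theta|^{-(1-\alpha-\beta)/(1-\alpha)})$ if $\alpha+\beta<1$; $\quad O(\log (1/|\theta|))$ if $\alpha+\beta=1$; $\quad O(1)$ if $\alpha+\beta>1$; as $\theta \to +0$,

$f^{\alpha,\beta}(\theta) = O(|\theta|^{-(1-\frac12\alpha-\beta)/(1-\alpha)})$ if $\tfrac12\alpha+\beta<1$; $\quad O(1)$ if $\tfrac12\alpha+\beta \ge 1$; as $\theta \to -0$.


Let

$S_n^{\alpha,0}(\theta) = \sum_{\nu=1}^{n} \exp 2\pi i (\nu^{\alpha} + \nu\theta).$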
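Put $a(u) = u^{\alpha} + u\theta$ and assume $|\theta| \le \tfrac12$. From Lemma 2 it follows that


(7) $S_n^{\alpha,0}(\theta) = \int_1^n \exp 2\pi i a(u)\,du + O(1),$


and, by Lemma 3,*


$\left| \int_1^n \exp 2\pi i a(u)\,du \right| \le A n^{1-\alpha/2}.$


Hence

(8) $|S_n^{\alpha,0}(\theta)| \le A n^{1-\alpha/2}.$

For subsequent discussion we need more precise estimates of $S_n^{\alpha,0}(\theta)$. To ob-
tain them the cases $0<\theta \le \tfrac12$, $-\tfrac12 \le \theta<0$ have to be treated separately. In the
first case we have, by Lemma 2,


$\left| \int_1^n \exp 2\pi i a(u)\,du \right| \le \dfrac{C}{\alpha n^{\alpha-1} + \theta},$


* We use A as a general notation of a constant which does not depend on $\theta$.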




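whence
(9) $|S_n^{\alpha,0}(\theta)| \le A n^{1-\alpha},$


(10) $|S_n^{\alpha,0}(\theta)| \le \dfrac{A}{\theta}.$


In the second case we put $t = -\theta$,


(11) $a(u) = u^{\alpha} - tu, \qquad a'(u) = \alpha u^{\alpha-1} - t, \qquad 0 < t \le \tfrac12.$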


Now if n= N,, again by Lemma 2, 


| f exp 2ria(u) du 
1 


and again 

(13) $|S_n^{\alpha,0}(\theta)| \le A n^{1-\alpha},$

By (8), 

(14) | < < A | Ni <n Nz. 


Finally, when n> we write 
n N; 
f exp 2ri(ut — tu)du = f 
1 1 Ne 


The first integral on the right gives the same contribution as (14) while the 
second integral can be estimated by Lemma 2 (with an obvious modification). 
Thus we get 


2ri( tu)du| < 
exp 2ri(u* — tu)du | < —————— s — 
Ne t — 
and 


(15) | < | , n> No. 


Now, by Abel’s partial summation, 


n—1 



A 

< ——— < | 
2 = => 

ane! — 



For any fixed value of $\theta \ne 0$, $|\theta| \le \tfrac12$, the second term of the right-hand
member tends to 0 as $n \to \infty$. Hence


(16) f«-8(9) = lim S*"(0) = <|0| 
val 


We write 
M 
v=N 
By (9) and (13) 
Ni Ni 
|P| SAD = AD 
1 v=1 
Hence, for 0<|6| <3, 
= O(| <1, 
P = {O(log = O(log| ), ifa+ =1, 
O(1), ifa+fB>1. 
At this juncture we have again to distinguish between the cases $0<\theta \le \tfrac12$
and $-\tfrac12 \le \theta<0$. In the first case we apply (10) which gives
1 
Q= o(— > = = 
0 +1 


Being combined with the estimates above for P this furnishes the proof of 
the first part of Theorem 3. 
When we write 


N: 


and apply (14) and (15). This yields 
Ne 
| R| <A = 4 


It follows that 


a 
= O(| if 8 + <i, 


=e = 1 — = 
2 


a 
o(1), BP + 




Finally, S is readily estimated by using (15) which gives 


$= o( > | g ) 


On combining these results we obtain a proof of the second part of Theorem 3. 

Theorem 3 shows that the behavior of $f^{\alpha,\beta}(\theta)$ in the neighborhood of
$\theta=0$ is different for $\theta \to +0$ and for $\theta \to -0$. In the interval $0<\theta \le \tfrac12$ the function
$f^{\alpha,\beta}(\theta)$ is always integrable. In the interval $-\tfrac12 \le \theta<0$ we are sure of integra-
bility only if $\beta>\alpha/2$. If $\beta=\alpha/2$ we get only $f^{\alpha,\beta}(\theta) = O(|\theta|^{-1})$ and the func-
tion is probably not integrable.* It will be integrable if we introduce ad-
ditional logarithms, as is shown in the next


THEOREM 4. The sum of the series


(17) $\sum_{n=2}^{\infty} n^{-\alpha/2} (\log n)^{-\gamma} \exp 2\pi i (n^{\alpha} + n\theta), \qquad \gamma>1,$

is $O(|\theta|^{-1} \log^{-\gamma} (1/|\theta|))$, and, consequently, the series is the Fourier series of
its sum.


Although this theorem is important for our purposes, the proof need not 
be gone into, as it is essentially the same as that of Theorem 3. 

5. We now prove 

THEOREM 5. If $1<\alpha/2+\beta<2$, $0<\alpha<1$, $\beta>0$, the function $f^{\alpha,\beta}$ satisfies
a Lipschitz condition of order $\alpha/2+\beta-1$.

The case $\alpha/2+\beta=1$ is contained in Theorem 3 and the other extreme
case is a corollary of it. If $1<\alpha/2+\beta<2$ it follows from Lemma 5 that the
series is everywhere convergent. Using (16) we have, with $N = [1/|h|]$,


-+ > = P+Q. 


ymN+1 
From (8) it follows that 


* It is certainly not integrable if $\beta<\alpha/2$, for otherwise the series $\sum n^{-1-\beta} \exp 2\pi i (n^{\alpha} + n\theta)$ would
be the Fourier series of a function of bounded variation (indeed absolutely continuous) satisfying a
Lipschitz condition of order $\alpha/2+\beta$ (see Theorem 5) and, by Theorem 1, its exponent of convergence
would be $\le 2/(\alpha/2+\beta+2)$, which is easily seen to be impossible. It is, however, obvious that for any
$\alpha$, $\beta>0$, the series (6) are Fourier-Riemann series.

† The same argument gives a more general result concerning the functions we obtain by intro-
ducing logarithms into the denominator of the series (6).




Av-*(| + | +| |) = o( 


= O( | h 
On the other hand we have 


By the well known theorem of S. Bernstein, if $T_n(\theta)$ is any trigonometric
polynomial of order n, and if $|T_n(\theta)| \le M$, then $|T_n'(\theta)| \le Mn$. In view of
(8) this yields at once


$\left| \dfrac{d}{d\theta} S_\nu^{\alpha,0}(\theta) \right| \le A \nu^{2-\alpha/2},$
whence
N 
| P | ini o(| h | h | N2-4/2-8) = 0(| h 
vol 


Theorem 5 is thus proved. 


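THEOREM 6. If $1<\alpha/2+\beta<2$, $0<\alpha<1$, $\beta>0$, $\gamma>0$, then the sum of the
series


(18) $\sum_{n=2}^{\infty} n^{-\beta} (\log n)^{-\gamma} \exp 2\pi i (n^{\alpha} + n\theta)$
satisfies a Lipschitz condition of order $\tfrac12\alpha+\beta-1$.
The proof is the same as in the case of Theorem 5.*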


THEOREM 7. The series


$\sum_{n=2}^{\infty} n^{-1-\alpha/2} (\log n)^{-\gamma} \exp 2\pi i (n^{\alpha} + n\theta), \qquad 0<\alpha<1, \quad \gamma>1,$


is the Fourier series of a function of bounded variation satisfying a Lipschitz
condition of order $\alpha$, and if $\gamma$ is sufficiently near to 1, its coefficients $c_n$ have the
property that $\sum |c_n|^k$ diverges for $k=2/(\alpha+2)$.


This follows from Theorems 4 and 6. Theorem 2 now follows from Theo-
rem 7.
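
The divergence claim in Theorem 7 rests on simple exponent arithmetic, which the following sketch makes explicit. It assumes that the coefficients have modulus $|c_n| = n^{-1-\alpha/2}(\log n)^{-\gamma}$, so that at $k = 2/(\alpha+2)$ one gets $|c_n|^k = n^{-1}(\log n)^{-\gamma k}$; the integral test then gives divergence exactly when $\gamma k \le 1$. The function names are illustrative.

```python
import math


def coefficient_modulus(n, alpha, gamma):
    """|c_n| = n^(-1 - alpha/2) * (log n)^(-gamma), for n >= 2."""
    return n ** (-1 - alpha / 2) * math.log(n) ** (-gamma)


def critical_k(alpha):
    """The critical exponent k = 2 / (alpha + 2) of Theorems 1 and 2."""
    return 2 / (alpha + 2)


def log_exponent(alpha, gamma):
    """Exponent s in |c_n|^k = 1 / (n (log n)^s) at the critical k."""
    return gamma * critical_k(alpha)


def diverges_at_critical_k(alpha, gamma):
    """Integral test: sum 1/(n (log n)^s) diverges iff s <= 1."""
    return log_exponent(alpha, gamma) <= 1


print(diverges_at_critical_k(0.5, 1.1))  # gamma close to 1
print(diverges_at_critical_k(0.5, 1.3))  # gamma too large
```

For $\alpha = 0.5$ the threshold is $\gamma \le 1 + \alpha/2 = 1.25$, which is why only values of $\gamma$ sufficiently near to 1 produce divergence.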


* We may also deduce Theorem 6 from Theorem 5 if we take into account that (18) is a "Faltung"
of (6) and $\sum (\log n)^{-\gamma} \cos 2\pi n\theta$ which is a Fourier-Lebesgue series for every $\gamma>0$. It is easy to see that
the modulus of continuity of (6) will be preserved.




III. ON A THEOREM OF FEJÉR AND RIESZ
1. The following result has been obtained by Fejér and Riesz [4].
THEOREM 1. Every analytic function $f(z)$ regular for $|z| \le 1$ satisfies the
inequality

$(1_p)$ $\int_D |f(z)|^p\,|dz| \le \dfrac12 \int_C |f(z)|^p\,|dz|,$


where C denotes the circle $|z|=1$ and D is its arbitrary diameter.


It is well known that it suffices to prove the inequality $(1_p)$ for any special
value of p; the general result then follows by a familiar argument.* Fejér
and Riesz started with the case $p=2$. An alternative proof of $(1_p)$ which is
given below begins with $p=1$. This proof is based on the following


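LEMMA. Let $u(z)$ and $v(z)$ be conjugate, not necessarily real,† harmonic func-
tions such that $v(0)=0$ and that $f(z)=u(z)+iv(z)$ is regular for $|z| \le 1$. Then,
with the same notations as before,


(2) $\int_D \left| \dfrac{v(z)}{z} \right| |dz| \le \dfrac12 \int_C |u(z)|\,|dz|.$

On setting


$z = re^{i\theta}, \qquad Q_r(t) = \sum_{n=1}^{\infty} r^n \sin nt = \dfrac{r \sin t}{1 - 2r\cos t + r^2},$


we have


$v(re^{i\theta}) = -\dfrac{1}{\pi} \int_0^{2\pi} u(e^{it})\, Q_r(t - \theta)\,dt.$


Without loss of generality we may assume that the diameter D is the seg-
ment $(-1, 1)$ of the X-axis. Then


$\int_0^1 [\,|v(r)| + |v(-r)|\,]\, \dfrac{dr}{r} \le M \int_0^{2\pi} |u(e^{it})|\,dt,$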


* It is well known that the condition of $f(z)$ being regular on C is not necessary and may be re-
placed by less stringent conditions. In the proof of Theorem 2 below we shall use the inequality $(1_p)$
under the assumption that $f(z)$ is continuous for $|z| \le 1$.

† A complex harmonic function $v(z)=v_1(z)+iv_2(z)$ is said to be conjugate to $u_1(z)+iu_2(z)$ if
$v_1(z)$ is conjugate to $u_1(z)$, and $v_2(z)$ to $u_2(z)$.




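where M is the upper bound, with respect to $\alpha$, of

$\dfrac{1}{\pi} \int_0^1 [\,|Q_r(\alpha)| + |Q_r(\alpha + \pi)|\,]\, \dfrac{dr}{r}.$


Suppose, as we may, that $0<\alpha<\pi$. On setting $\sin \alpha=h$, $\cos \alpha=k$, we see
that the last integral is equal to


$\dfrac{h}{\pi} \left[ \int_0^1 \dfrac{dr}{1 - 2kr + r^2} + \int_0^1 \dfrac{dr}{1 + 2kr + r^2} \right] = \dfrac12.$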


Our lemma is thus established.†
Now we notice that if $g(z)$ is analytic, the function $-ig(z)$ is conjugate to
$g(z)$. Consequently, applying our lemma to the functions $u(z) = zf(z)$ and
$v(z) = -izf(z)$, we get the inequality $(1_1)$ and, hence, the whole Theorem 1.
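
The constant $\tfrac12$ in the Lemma comes from an elementary integral that does not depend on $\alpha$. The sketch below evaluates $\frac{h}{\pi}\bigl[\int_0^1 \frac{dr}{1-2kr+r^2} + \int_0^1 \frac{dr}{1+2kr+r^2}\bigr]$ with $h = \sin\alpha$, $k = \cos\alpha$ by the midpoint rule; the quadrature scheme and the step count are illustrative choices, not part of the paper's argument.

```python
import math


def lemma_integral(alpha, steps=20000):
    """Midpoint-rule value of (h/pi) * [int dr/(1-2kr+r^2) + int dr/(1+2kr+r^2)]."""
    h, k = math.sin(alpha), math.cos(alpha)
    total = 0.0
    for j in range(steps):
        r = (j + 0.5) / steps
        total += h / (1 - 2 * k * r + r * r) + h / (1 + 2 * k * r + r * r)
    return total / (steps * math.pi)


for alpha in (0.1, 1.0, math.pi / 2, 3.0):
    print(alpha, lemma_integral(alpha))
```

In closed form the two integrals add up to $\arctan\tan\frac{\alpha}{2} + \arctan\cot\frac{\alpha}{2} = \frac{\pi}{2}$, so every printed value is close to 0.5.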
2. We now prove 


THEOREM 2. Let $f(z) = u(z)+iv(z)$ be regular for $|z| \le 1$, where u and v are
real and $v(0)=0$. There exists a constant $A_p$, depending only on p, and uniformly
bounded in every interval $1 \le p \le p_0$, such that


A preliminary remark is worth making. It has been proved by M. Riesz 
[22] that for any p>1 we have 


2r 
aos at, "| |» ao, 
0 0 


where M, depends only on #, and so 


1 
f s > f lrlas| 
p-1 |p P\lid 


< 2>-"(M,+ | u(t) || 
Cc 


† The constant $\tfrac12$ in (2) cannot be improved for, otherwise, we could improve the inequality
$(1_p)$, which is known to be impossible (Fejér and Riesz, loc. cit.). Another example is given by the
pair of conjugate functions $u(Rz)$ and $v(Rz)$, where

$u(z) = P_r(\theta) = \dfrac12\, \dfrac{1 - r^2}{1 - 2r\cos\theta + r^2}, \qquad v(z) = Q_r(\theta),$


and R is sufficiently near to 1.




However we cannot put $A_p^p = 2^{p-1}(M_p^p+1)$, since $M_p$ is known to be un-
bounded in the neighborhood of $p = 1$, so that Theorem 2 is not a consequence
of $(1_p)$ and of M. Riesz's theorems on conjugate functions, although every
single inequality $(3_p)$, for $p>1$, is such a consequence.

Assume again for simplicity that D is the interval $(-1, 1)$ of the X-axis.
To any continuous function $u(e^{it})$ defined on $|z| =1$, there corresponds a
function $v(z) = T\{u\}$, conjugate to the Poisson integral of $u(e^{it})$, defined for
$-1<r<1$. The functional $v=T\{u\}$ is additive and the inequality $(3_p)$ is
certainly true for $p=1$ and $p=p_0$. By a theorem of M. Riesz [21], the upper
bound (with respect to all continuous functions) of the ratio


$\left( \int_{-1}^{1} |v(r)|^p\,dr \right)^{1/p} \Big/ \left( \int_0^{2\pi} |u(e^{it})|^p\,dt \right)^{1/p}$


is a convex function of $1/p$, $p \ge 1$. Hence, if $A_p$ denotes the smallest possible
value for which $(3_p)$ is true and if $1<p<p_0$, the number $A_p$ does not exceed
$\max (A_1, A_{p_0})$.


THEOREM 3. Under the conditions of Theorem 2 we also have 


(,) Plats f 


where A, is a constant analogous to, but not necessarily the same as, the constant 
A, of Theorem 2. 


In fact, if ao=f(0), we find, arguing as above, that 


v(z) |? f(z) — a |? 
d | 


< 2>-(M,+ nf | u(z) — ao|? 
Cc 


since 


| aol? =| (= | ao)” | 


The rest of the proof is the same. 
3. Additional remarks. (i) The function $u(z) = P_r(\theta)$ shows that Theorem
2 is false if in the left-hand member of the inequality $(3_p)$ we replace v by u,
but, of course, the new inequality is true if $1+\epsilon \le p \le p_0$, for every $\epsilon>0$.
(ii) Let $u(z)$ be real and harmonic for $|z| \le 1$. Applying the Lemma to the
conjugate functions $\partial u/\partial\theta$ and $-r\,\partial u/\partial r$ we obtain the following result: If a
function $u(e^{it})$ is of bounded variation, the corresponding harmonic function
defined by Poisson's integral is of (uniformly) bounded variation on any
radius (cf. Prasad [19] where a more general result is proved).

(iii) Theorem 2 is probably false for any $0<p<1$. It is certainly false e.g.
for $p=\tfrac12$, as the example of conjugate functions $dP_r(\theta)/d\theta$ and $dQ_r(\theta)/d\theta$
shows (see footnote on page 600).


IV. ON A THEOREM ON CONJUGATE FUNCTIONS 


1. The following is one of the several definitions of an integral given by
Denjoy [2].

A function $f(x)$ defined for $a \le x \le b$ and continued outside $(a, b)$ by the
condition of periodicity is said to be integrable B on $(a, b)$ if, for an arbitrary
subdivision $a=a_0<a_1<a_2< \cdots <a_n=b$ and arbitrary set of values $\xi_i$,
$a_{i-1} \le \xi_i \le a_i$, the expression
(1) $J(f; t) = \sum_{i=1}^{n} f(\xi_i + t)(a_i - a_{i-1})$

tends in measure to a limit J, when $\max (a_i - a_{i-1}) \to 0$.† J is then called the
value of the integral of f over $(a, b)$.

It is not difficult to grasp the meaning of the above definition. Instead
of one (periodic) function $f(x)$, we consider the whole family $f_t(x)$ derived
from $f(x)$ by translating the argument x by t and construct for each of them
the Riemannian approximating sums. Even if the function $f(x)$ (and, conse-
quently, any $f_t(x)$) is not integrable R, it may happen that "on the whole"
the sums $J(f; t)$ are near to a number J, and the nearer, the smaller $\max
(a_i - a_{i-1})$ is. Thus, the integral B is what may be called "Riemann's integral
in measure."
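
The idea of "Riemann's integral in measure" can be made concrete with an unbounded function that is not integrable R. The sketch below takes the period-1 function $f(x) = \{x\}^{-1/2}$ (an illustrative choice, with Lebesgue integral 2 over one period), uniform subdivisions with left-endpoint tags, and estimates the measure of the set of translations t for which the sum $J(f; t)$ misses the value 2 by more than $\epsilon$. For a uniform subdivision into n cells the sum has period $1/n$ in t, so sampling one cell suffices; the names and sample counts are assumptions of this sketch.

```python
def f(x):
    """Period-1 function, unbounded near the integers; its integral over (0, 1) is 2."""
    return (x % 1.0) ** -0.5


def riemann_sum(n, t):
    """J(f; t) for the uniform subdivision of (0, 1) with tags xi_i = i / n."""
    return sum(f(i / n + t) for i in range(n)) / n


def exceptional_fraction(n, eps, samples=200):
    """Estimated relative measure of the t for which |J(f; t) - 2| > eps."""
    bad = 0
    for j in range(samples):
        t = (j + 0.5) / (samples * n)  # one cell of length 1/n is enough
        if abs(riemann_sum(n, t) - 2.0) > eps:
            bad += 1
    return bad / samples


fractions = [exceptional_fraction(n, 0.1) for n in (50, 500, 5000)]
print(fractions)
```

The sums are unbounded in t for every fixed n (a tag may fall arbitrarily close to the singularity), yet the exceptional set shrinks as the subdivision is refined, which is the convergence in measure required by the definition.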

This definition has found a rather unexpected application in the theory 
of trigonometric series by the following theorem of Kolmogoroff [14]: 


THEOREM A. If $f(x)$ is integrable L and


(2) $f(x) \sim \tfrac12 a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx),$


the (generalized) sum $\bar f(x)$ of the conjugate series


(3) $\sum_{n=1}^{\infty} (a_n \sin nx - b_n \cos nx)$


is integrable B, and, moreover, (3) is the Fourier (-Denjoy) series of $\bar f(x)$.


† In other words, for every $\epsilon>0$ there exists a $\delta=\delta(\epsilon)$, such that, if only $\max (a_i - a_{i-1}) <\delta$, the
measure of the set of values of t for which $|J - J(f; t)| > \epsilon$ is less than $\epsilon$.



Kolmogoroff's proof is based on an inequality concerning the measure of
the set of points for which $|\bar f(x)| \ge R$. As the proofs of this inequality, so far
published [13, 25], are not simple, an alternative proof of Theorem A would
be, perhaps, of some interest. The proof given below uses a theorem (Theorem
C) also due to Kolmogoroff, which may be considered now as fairly simple
(cf. Hardy [5]).

2. We begin by proving the following theorem of Denjoy [2, 1] (which
is not necessary for the proof of Theorem A).

THEOREM B. If $f(x)$ is integrable L in $(a, b)$ it is also integrable B and both
definitions give the same value of the integral.†


Let


$J = \int_a^b f(x)\,dx, \qquad J^* = \int_a^b |f(x)|\,dx.$
Integrating (1) we get


(4) $\int_a^b |J(f; t)|\,dt \le \sum_{i=1}^{n} (a_i - a_{i-1}) \int_a^b |f(\xi_i + t)|\,dt = (b-a) J^*.$
Suppose that $J^* < \epsilon^2/(3(b-a))$. Then the left-hand member in (4) does not
exceed $\epsilon^2/3$ and the measure of the set of values of t for which $|J(f; t)| > \epsilon/3$
does not exceed $\epsilon$. In the general case we put $f=f_1+f_2$ and introduce the in-
tegrals $J_1$, $J_1^*$, $J_2$, $J_2^*$, analogous to $J$, $J^*$. We may suppose that $f_1$ is continu-
ous and that $J_2^* < \epsilon^2/(3(b-a))$. Then $|J(f_2; t)| < \epsilon/3$ except in a set of measure
$<\epsilon$. On the other hand, if $\max (a_i - a_{i-1})$ is sufficiently small, we have for
every t the inequality $|J(f_1; t) - J_1| < \epsilon/3$ and so (assuming as we may, that
$\epsilon<b-a$),
S + €2/(3(b — a)) + &/(3(b — a)) <e 

except in a set of measure <e. 


3. The theorem which we will use in the proof of Theorem A and which
we take for granted is as follows.


THEOREM C. If $f(x)$ is integrable L over $(0, 2\pi)$, and $\bar f(x)$ is conjugate to f,
then


(5) $\left( \int_0^{2\pi} |\bar f(x)|^{1-\epsilon}\,dx \right)^{1/(1-\epsilon)} \le A_\epsilon \int_0^{2\pi} |f(x)|\,dx,$


where $A_\epsilon$ is a constant depending only on $\epsilon>0$.


t The proof given in the text is due to Dr. S. Saks. 




Now it is obvious that if we replace in (1) f by $\bar f$, we obtain the function
$J(\bar f; t)$ conjugate to $J(f; t)$. Hence, from (5), with $\epsilon=\tfrac12$ we get


(6) $\int_0^{2\pi} |J(\bar f; t)|^{1/2}\,dt \le A_{1/2}^{1/2} \left( \int_0^{2\pi} |J(f; t)|\,dt \right)^{1/2}.$


Suppose first that the right-hand member of (6) does not exceed $\epsilon^{3/2}$. Then
the measure of the set of values of t for which $|J(\bar f; t)|$ exceeds $\epsilon$ is less than $\epsilon$. In the general
case we put again $f=f_1+f_2$, where $f_1$ has a continuous derivative (so that $\bar f_1$
is continuous) and the integral of $|f_2|$ is small. In the equality $J(\bar f; t) = J(\bar f_1; t)
+ J(\bar f_2; t)$ the term $J(\bar f_1; t)$ is small for every t, provided that $\max (a_i - a_{i-1})$ is
small (the Fourier series of $\bar f_1$ has no constant term) and $J(\bar f_2; t)$ is small, ex-
cept in a set of small measure. This shows that $\bar f$ is integrable B and the value
of the integral is 0, as was to be expected.

4. To prove the second part of Theorem A, we have to show that the
products $\bar f(x) \cos kx$ and $\bar f(x) \sin kx$ are integrable B and that the correspond-
ing integrals are $-\pi b_k$, $\pi a_k$, $k=1, 2, \dots$. We may suppose that
$a_0 = a_1 = b_1 = \cdots = a_k = b_k = 0$. Then it is not difficult to verify that the conjugate
functions of $f(x) \cos kx$, $f(x) \sin kx$, are $\bar f(x) \cos kx$, $\bar f(x) \sin kx$ respectively.
Hence the products $\bar f(x) \cos kx$, $\bar f(x) \sin kx$ are integrable B and their in-
tegrals over $(0, 2\pi)$ vanish.


V. ON AN EXTREME CASE IN THE THEORY OF FRACTIONAL INTEGRALS 


1. Hardy and Littlewood have proved [7] that if $f(x)$ belongs to $L^p$ ($p>1$)
in an interval $(a, b)$, where $-\infty \le a<b \le \infty$, then the function $f_\alpha(x)$, the frac-
tional integral of order $\alpha$ of $f(x)$, belongs to $L^q$, provided that


(1) $1/p - 1/q = \alpha, \qquad p>1.$


As may be shown by very simple examples [7], this theorem is no longer true
when $p=1$. The main purpose of this note is to find a substitute theorem for
this case and to give some indications concerning the case $\alpha=1/p$. Since
these theorems have some applications in the theory of Fourier series, Weyl's
definition of fractional integral [27] will be more convenient for us and we
shall use it throughout, instead of the familiar Riemann-Liouville definition.
According to Weyl's definition


$f_\alpha(x) = \dfrac{1}{\Gamma(\alpha)} \int_{-\infty}^{x} f(t)(x-t)^{\alpha-1}\,dt, \qquad 0<\alpha<1,$


where the integrable function f has the period $2\pi$ and the constant coefficient
of its Fourier series vanishes. The latter condition will be tacitly assumed
throughout this paper, wherever it is necessary.


The arguments will be based on the theorem just mentioned, which it
will be necessary for our purposes to state in its complete form.


THEOREM 1 (Hardy-Littlewood). If $f(x) \in L^p$, $p>1$, in the interval $(0, 2\pi)$
and if the relations (1) are satisfied, then
(2) $\mathfrak{M}_q(f_\alpha) \le M\, \mathfrak{M}_p(f)$


with 
(3) 


where $\mu$ is an absolute constant.†


M = 


2. We begin by proving the following


THEOREM 1. Suppose that $f \in L^r$, $r>1$, and that $\mathfrak{M}_r(f) \le 1$. Then there exist
two constants $\lambda>0$ and A independent of f, such that


(4) $\int_0^{2\pi} \exp \left( \lambda |f_{1/r}(x)|^{r'} \right) dx \le A.$


This result shows that the function $f_{1/r}(x)$, which by the theorem of Hardy
and Littlewood is integrable in any power, is integrable exponentially.


Put in (2) 
rk 
a = 1/r, 
rk 
gq=rk>?, & = 


Then, since $f \in L^r$, and since $\mathfrak{M}_p(f)$ is an increasing function of p we deduce
from (2), (3) that


(S) 
< DM-(f) Di, 
where 
D; = = u(r’ k) 
Raising the inequality (5) to the power $r'k$, multiplying it by $\lambda^k/k!$, and sum-
ming from $k=2$ on, we get, by Stirling's formula,


t We use the familiar notation 


The numerical value of $\mu$ in (3) is irrelevant for our purposes. When the Riemann-Liouville definition
is used (in the interval $(0, \infty)$) we may put e.g. $\mu = \max 1/\Gamma(1+\alpha)$. For the definition adopted in
this paper the value of $\mu$ ten times as large will certainly be sufficient.




k=2 k! 0 k=2 k! k=2 


Cc 
<——— = B, 
1 — p’’r’de 


(6) 


where C is an absolute constant and $\lambda$ is assumed to be so small that $\mu^{r'} r' \lambda e<1$.
Let $\psi(x) = e^x - 1 - x$. Noticing that for $x \ge 0$, $e^x \le 2\psi(x) + C$ (see footnote
on page 587), and taking into account only the extreme terms of the inequality
(6), we see that, with a modified value of B, we may replace in the first of
them the lower limit of summation by 0, and this is just the inequality (4).

3. If we put (which we have no right to do) in the second relation (1)
$p=1$, we should obtain $q=1/(1-\alpha)$. But the theorem is false for $p=1$; to
state the correct form of this extreme case we introduce the class $L^{1,\alpha}$ of
functions f such that $|f| (\log^+ |f|)^{\alpha}$ is integrable. We have then


THEOREM 2. If $f \in L^{1,1-\alpha}$, $0<\alpha<1$, then $f_\alpha \in L^{\beta}$, $\beta=1/(1-\alpha)$, and


(7) $\mathfrak{M}_\beta(f_\alpha) \le M \int_0^{2\pi} |f| (\log^+ |f|)^{1-\alpha}\,dx + N,$


where the constants M and N do not depend on f.†


Given any integrable function $\phi(x)$, $0 \le x \le 2\pi$, we shall denote by $\sigma_n[\phi]$
the first arithmetical means of the Fourier series of $\phi$. It is well known that
the two inequalities


$\mathfrak{M}_r(\phi) \le A, \qquad \mathfrak{M}_r(\sigma_n[\phi]) \le A,$


where $A$ is a constant and $r \ge 1$, are equivalent. Therefore, if we wish to prove
that $f_\alpha \in L^\beta$ it is sufficient to show the existence of a number $A$ such that


(8) $\left| \int_0^{2\pi} \sigma_n[f_\alpha]\, g(-x)\,dx \right| \le A$
for every (periodic) $g$ with $\mathfrak{M}_{\beta'}(g) \le 1$. It is well known that


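$f(x) \sim \sum_{n=-\infty}^{\infty} c_n e^{inx} \quad (c_0 = 0)$ implies


(9) $f_\alpha(x) \sim \sum_{n=-\infty}^{\infty} c_n \exp\left(-\tfrac{1}{2}\pi i \alpha\, \operatorname{sg} n\right) |n|^{-\alpha} e^{inx}$ (the term $n=0$ being omitted).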
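
The multiplier in (9) can be illustrated numerically. The sketch below is hedged: it assumes that the fractional integral adopted in the paper is Weyl's, i.e. that the $n$-th coefficient is divided by $(in)^\alpha$ with the principal branch, which is the usual reading of a factor of the form $\exp(\cdot\, \operatorname{sg} n)\,|n|^{-\alpha}$. The function names are hypothetical.

```python
import cmath
import math


def weyl_multiplier(n: int, alpha: float) -> complex:
    """exp(-i*pi*alpha*sgn(n)/2) * |n|^(-alpha), defined for n != 0."""
    if n == 0:
        raise ValueError("the term n = 0 is excluded (c_0 = 0)")
    sgn = 1 if n > 0 else -1
    return cmath.exp(-0.5j * math.pi * alpha * sgn) * abs(n) ** (-alpha)


def principal_power(n: int, alpha: float) -> complex:
    """(i n)^(-alpha), computed with the principal branch of the complex power."""
    return complex(0.0, n) ** (-alpha)
```

The two functions agree for every nonzero integer $n$, which shows that the factor has modulus $|n|^{-\alpha}$ and only rotates the coefficients.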


From this it is easily seen that the left-hand member of (8) is equal to 


† It may be added that Theorems 1 and 2 are valid in the case of the Riemann-Liouville definition,
at least if we suppose that the interval of integration is finite. The use of arithmetical means
in the proof below of Theorem 2 is not essential and could easily be avoided.




|. 


Using W. H. Young's inequality† we see that this expression does not exceed


(10) $\int_0^{2\pi} \Phi(|f|)\,dx + \int_0^{2\pi} \Psi(|\sigma_n[g_\alpha]|)\,dx,$


where $\Phi$ and $\Psi$ are conjugate. We take
(11) $\Psi(x) = \exp(\lambda x^{\beta}) - 1,$


$\lambda$ being the same constant as occurs in Theorem 1. Since $\Psi$ is convex, we have
by the inequality of Jensen and the inequality (4)


1 2r 


1 


2r 2r 
0 0 0 


where $K_n(t)$ denotes the Fejér kernel. It follows from (11) that, for $y$ large,
the conjugate function $\Phi(y)$ is asymptotically equal to $\lambda^{-1/\beta} y (\log y)^{1/\beta}$, and
so the first term in (10) has a finite value $M$. Consequently (8) is true with
$A = A + M$ and Theorem 2 is established.

4. We now prove 


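THEOREM 3. If $0<\alpha<1$ and $|f|(\log^+|f|)^{\alpha}$ is integrable, the (complex) Fourier coefficients $c_n$
of $f$ satisfy the inequality


(13) $\left( \sum_{n=1}^{\infty} n^{-1} |c_n|^{1/\alpha} \right)^{\alpha} \le A_\alpha \int_0^{2\pi} |f| (\log^+|f|)^{\alpha}\,dx + B_\alpha,$


with $A_\alpha$ and $B_\alpha$ depending only on $\alpha$. For $\alpha>1$ the theorem is false.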
† Let $\phi(x)$, $x \ge 0$, be a continuous increasing function, with $\phi(0)=0$, and let $\psi(y)$ be the function
inverse to $\phi(x)$. If
$\Phi(x) = \int_0^x \phi(t)\,dt, \qquad \Psi(y) = \int_0^y \psi(t)\,dt,$
then, for every $a \ge 0$, $b \ge 0$, we have
(*) $ab \le \Phi(a) + \Psi(b).$


The sign $\le$ in (*) degenerates into $=$ if, and only if, $b=\phi(a)$. The functions $\Phi$ and $\Psi$ are called conjugate,
of course, in the sense different from that used in the theory of Fourier series. For a very simple
proof of Young's inequality (*) see Oppenheim [16].
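
A concrete pair of conjugate functions makes (*) easy to test. The sketch below is an illustration under our own choice $\phi(x) = e^x - 1$, which is not the pair used in the paper; then $\psi(y) = \log(1+y)$, $\Phi(a) = e^a - 1 - a$ and $\Psi(b) = (1+b)\log(1+b) - b$, an exponential function paired with one of the type $y \log y$. The function names are hypothetical.

```python
import math


def Phi(a: float) -> float:
    """Integral of phi(x) = e^x - 1 from 0 to a."""
    return math.expm1(a) - a


def Psi(b: float) -> float:
    """Integral of the inverse function psi(y) = log(1 + y) from 0 to b."""
    return (1.0 + b) * math.log1p(b) - b


def young_gap(a: float, b: float) -> float:
    """Phi(a) + Psi(b) - a*b; Young's inequality says this is never negative."""
    return Phi(a) + Psi(b) - a * b
```

The gap vanishes exactly on the curve $b = \phi(a) = e^a - 1$, in agreement with the equality case stated in the footnote.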




We assume for simplicity that $f$ is real and so $c_{-n} = \bar{c}_n$. Similarly, although
we suppose in the proof that $c_0 = 0$, the inequality (13) remains valid without
this assumption. From (9) and Theorem 2 it follows that


(14) fr-e(x) ~ |e,| = 1. 


n=—o 


Now we use the following theorem of Hardy and Littlewood [6]: 
If $\phi \in L^p$, $1<p\le 2$, then


(15) <4, f | dx, 


with $A_p$ independent of $\phi$.

Applying this theorem, with $p = 1/\alpha$, $\phi = f_{1-\alpha}$, to the series (14) and
taking into account (7), we obtain (13) for $\tfrac12 \le \alpha < 1$. We shall not consider
here the case $\alpha = 1$†, and, for $0<\alpha<\tfrac12$, since the inequality (13) can be
strengthened†, we shall be contented with proving the convergence of the
series (13).

LEMMA. Let $f(x)$, $g(x)$ be non-negative in $(0, 2\pi)$ and let $\phi(u)$, $\psi(u)$, $u \ge 0$,
be two non-negative and non-decreasing convex functions. Put


(16) x(u) = B(x) = f + dat. 
0 


If 
=, V(g(a))dx <1, 
0 0 


f x(h(x))dx S $(f(x))dx. 
0 


0 


Let 1/x, x21, be the value of the second integral in (17). Using twice 
Jensen’s inequality, we have 


x(h(x)) = g(x + nat) | < + | 


0 


Qn Qn 
< + nya | < [ + 
0 0 


Qn Qn Qn 
0 0 0 


0 


† See the next Note VI.


then 




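COROLLARY. If $|f|(\log^+|f|)^{\alpha}$ and $|g|(\log^+|g|)^{\beta}$ are integrable, $\alpha \ge 0$, $\beta \ge 0$, then $|h|(\log^+|h|)^{\alpha+\beta}$ is integrable.


Let $\phi(u) = u(\log^+ u)^{\alpha}$, $\psi(u) = u(\log^+ u)^{\beta}$. We may plainly assume that
conditions (17) are satisfied. Then it is sufficient to notice that, for $u \to \infty$, we
have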


x(u) = u(log {log (u log } u(log 


Suppose now that $\tfrac14 \le \alpha < \tfrac12$ and set $g = f$ in the integral (16). It is well
known that then


h(x) ~ 29 | cn |? 


n=1 


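Since $|h|(\log^+|h|)^{2\alpha}$ is integrable and $\tfrac12 \le 2\alpha < 1$, we obtain, by applying Theorem 3 in the
case already established, the convergence of the series


(18) $\sum_{n=1}^{\infty} n^{-1} (|c_n|^2)^{1/(2\alpha)} = \sum_{n=1}^{\infty} n^{-1} |c_n|^{1/\alpha}.$


We proceed similarly when $\tfrac18 \le \alpha < \tfrac14$, and so on.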
In order to show that the condition $0<\alpha<1$ cannot be removed, consider
the function


$f(x) = \sum_{n=2}^{\infty} (\log n)^{-\alpha} (\log_2 n)^{-\beta} \cos nx.$


It may be shown that in the neighborhood of $x = 0$,


$f(x) = O\left[x^{-1} (\log 1/x)^{-(\alpha+1)} (\log_2 1/x)^{-\beta}\right],$


and, consequently, $|f|(\log^+|f|)^{\alpha}$ is integrable if only $\beta>1$. If, moreover, $1<\beta<\alpha$, the series
(18) diverges. To get the needed estimate observe that, by Abel's transformation,


x 
sin? + 1) 


= a, COS NX = > A’a, 
n=2 n=2 x 

4 sin? — 
2 


Now we break up the last sum into two, the first being extended over the
range $2 \le n \le 1/x$. In the first sum the coefficient of $\Delta^2 a_n$ is $O(n^2)$, in the second
it is $O(x^{-2})$. It simplifies slightly the proof if we use the fact that $\Delta^2 a_n \ge 0$
for $n \ge n_0$.
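
The two orders of magnitude used above can be checked on the kernel that a double Abel transformation of a cosine series produces. The sketch below is hedged: it assumes that the coefficient of $\Delta^2 a_n$ is, up to a constant factor, $\sin^2(\tfrac12 (n+1)x) / (2\sin^2 \tfrac12 x) = (n+1)K_n(x)$, where $K_n$ is the Fejér kernel normalised as $\tfrac12 + \sum_{k=1}^{n} (1 - k/(n+1)) \cos kx$. The displayed formula is damaged in this copy, so the exact constants are an assumption, and the function names are hypothetical.

```python
import math


def fejer_sum(n: int, x: float) -> float:
    """Fejer kernel as a cosine polynomial: 1/2 + sum (1 - k/(n+1)) cos kx."""
    return 0.5 + sum((1.0 - k / (n + 1)) * math.cos(k * x) for k in range(1, n + 1))


def fejer_closed(n: int, x: float) -> float:
    """Closed form sin^2((n+1)x/2) / (2 (n+1) sin^2(x/2)), for 0 < x < 2*pi."""
    return (math.sin(0.5 * (n + 1) * x) / math.sin(0.5 * x)) ** 2 / (2.0 * (n + 1))


def kernel_coefficient(n: int, x: float) -> float:
    """(n+1) K_n(x): the assumed coefficient of the second difference."""
    return (n + 1) * fejer_closed(n, x)
```

For $0 < x \le \pi$ the coefficient is at most $(n+1)^2/2$, which is the better bound when $n \le 1/x$, and at most $\pi^2/(2x^2)$, which is the better bound when $n > 1/x$.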




VI. SOME THEOREMS ON FOURIER COEFFICIENTS
1. Given a sequence of (complex) numbers $c_1, c_2, \dots, c_n, \dots$, we shall
denote by $c_1^*, c_2^*, \dots, c_n^*, \dots$ the sequence $|c_1|, |c_2|, \dots, |c_n|, \dots$,
rearranged in descending order of magnitude.
Hardy and Littlewood [9] have established the following theorems.†


THEOREM A. Suppose that


$f(x) \sim \sum_{n=-\infty}^{\infty} c_n e^{inx}, \qquad c_{-n} = \bar{c}_n,$
and that $|f| \log^+|f| \in L$. Then $\sum n^{-1} c_n^*$ is convergent and $\sum \exp(-k/|c_n|)$
is convergent for every $k>0$.

THEOREM B. Suppose that $\sum |c_n| (\log (1/|c_n|))^{-1}$ is convergent. Then
$\sum c_n e^{inx}$ is the Fourier series of a function $f$, such that $\exp(k|f|)$ is integrable
for every $k>0$.

Our object here is to generalize these theorems in two directions. First, 
we consider slightly more general types of integrability and, secondly, the 
results are extended to general, uniformly bounded, orthogonal systems. 

Let $\{\phi_n(x)\}$‡ be a system of functions, orthogonal and normal in a
finite interval $(a, b)$ and uniformly bounded,


(1) $|\phi_n(x)| \le M.$


These conditions will be assumed in the following discussion.


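THEOREM 1. If $|f| (\log^+|f|)^{\alpha} \in L$ in $(a, b)$, $\alpha>0$, then
(i) the series $\sum \exp(-k/|c_n|^{1/\alpha})$ converges for every $k>0$.
(ii) If, moreover, $\alpha \le 1$, we have $\sum_{n=2}^{\infty} n^{-1} c_n^{*1/\alpha} < \infty$.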


THEOREM 2. If the series $\sum |c_n| (\log (1/|c_n|))^{-\alpha}$, $\alpha>0$, converges, the series


(2) $\sum_{n=2}^{\infty} c_n \phi_n(x)$


is the Fourier series of a function $f$ such that $\exp(k|f|^{1/\alpha})$ is integrable for every
$k>0$.


† The results are stated without proofs. A result less strong than Theorem A, viz., the convergence
of the series $\sum n^{-1}|c_n|$, is proved in Zygmund [29, Theorem 3]. Since the argument used
there can be applied, with slight modifications, to general uniformly bounded orthogonal systems, it
yields also the result of Hardy and Littlewood. The latter result is, in turn, contained in the following
theorem: If $f(x) \sim c_0 + c_1 e^{ix} + \dots + c_n e^{inx} + \dots$, then the series $\sum n^{-1}|c_n|$ converges.

‡ It is slightly more convenient to denote the system by $\phi_2, \phi_3, \dots$, and not by $\phi_1, \phi_2, \dots$.
Correspondingly, $c_2^*, c_3^*, \dots$ denotes the sequence $|c_2|, |c_3|, \dots$ rearranged in descending order.




2. The proof will be based on a series of lemmas. 


LEMMA 1. Let $y = \phi(x)$, $x \ge 0$, be a non-negative, continuous, strictly increasing
function with $\phi(0) = 0$. Let $x = \psi(y)$ be the function inverse to $\phi(x)$.
Then, for every $a \ge 0$, $b \ge 0$, the inequality


(3) $ab \le \int_0^a \phi(x)\,dx + \int_0^b \psi(y)\,dy$


holds. The sign of equality occurs in (3) if, and only if, $b = \phi(a)$. (Cf. footnote †
on page 607.)


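LEMMA 2. Let


$f(x) \sim \sum_{n=2}^{\infty} c_n \phi_n(x),$


where


$|c_n| \le n^{-1} (\log n)^{\alpha-1}, \quad \alpha > 0, \quad n = 2, 3, \dots.$


Then, for $\lambda>0$ sufficiently small, we have


(4) $\int_a^b \exp(\lambda |f|^{1/\alpha})\,dx \le A.$†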


The function $x^{-1}(\log x)^{\alpha-1}$ decreases for $x > e^{\alpha-1}$. Let $n_0$ be an integer $> e^{\alpha-1}$.
Without loss of generality we may assume that $c_n = 0$ for $n \le n_0$. If $\mu \ge 2$, we
have, by the F. Riesz theorem (cf. M. Riesz [21]),


fist ues] > n-*’ (log 


n=not+1 


< f x’ (log x) az) 
2 


The last factor does not exceed A*~! where A = A, is a constant independent of 
if only 

Put u=8k, where 8=1/a. Let ko denote an integer, such that Bk =o for 
k2=ko. Then 


ne 
and, by Stirling’s formula, we obtain 


† We designate by $A$ any constant (not necessarily the same in all the formulas) which does not
depend on $f$.




(5) f|*)* < A, 


a 


if only \eA°M*6 <1. Since 


ko—1 


k=kg 


for “=o, the inequality (4) follows from (5). 


LEMMA 3. If $|f| (\log^+|f|)^{\alpha} \in L$, $\alpha>0$, and if $c_n$, $n = 2, 3, \dots$, are the
Fourier coefficients of $f$ with respect to $\{\phi_n\}$, then


(6) $\sum_{n=2}^{\infty} n^{-1} (\log n)^{\alpha-1} c_n^* \le A \int_a^b |f| (\log^+|f|)^{\alpha}\,dx + A.$


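Since the order of the functions $\phi_n$ is irrelevant, we may suppose that
$c_n^* = |c_n|$. Put $\epsilon_n = \operatorname{sg} c_n$ and consider the partial sums $s_N$ of the series


(7) $\sum_{n=2}^{\infty} \epsilon_n n^{-1} (\log n)^{\alpha-1} \phi_n(x).$


Using Young's inequality we obtain


(8) $\sum_{n=2}^{N} n^{-1} (\log n)^{\alpha-1} c_n^* = \int_a^b f(x) s_N(x)\,dx \le \int_a^b \Phi(|f|)\,dx + \int_a^b \Psi(|s_N|)\,dx.$


Put
(9) $\Psi(x) = x \exp(\lambda_0 x^{\beta}) - x$, $\beta = 1/\alpha$, and hence $\Phi(x) \sim (\lambda_0)^{-\alpha} x (\log x)^{\alpha}$ as $x \to \infty$,


where $\lambda_0$ is any positive constant less than the constant $\lambda$ occurring in (4).
Since $\Psi(x) \le \exp(\lambda x^{\beta})$, $x \ge x_0$, we get, from (8) and (4),


(10) $\sum_{n=2}^{N} n^{-1} (\log n)^{\alpha-1} c_n^* \le \int_a^b \Phi(|f|)\,dx + A.$


Since $\Phi(x) \le A x (\log x)^{\alpha}$, $x \ge x_0$, (6) follows from (10).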
3. Now it is not difficult to prove Theorem 1. Let $B_0$ denote the right-hand
side of (6). Then


(11) $c_n^* \sum_{k=2}^{n} k^{-1} (\log k)^{\alpha-1} \le \sum_{k=2}^{n} k^{-1} (\log k)^{\alpha-1} c_k^* \le B_0.$


Since the coefficient of $c_n^*$ in the first term is $\ge \rho (\log n)^{\alpha}$, $\rho$ being a constant
independent of $n$, the following three inequalities are consequences of (11):
(12) $c_n^* \le B_0 \rho^{-1} (\log n)^{-\alpha}$, $\quad \log n \le (B_0/(\rho c_n^*))^{\beta}$, $\quad n \le \exp (B_0/(\rho c_n^*))^{\beta}.$
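
The lower bound $\rho(\log n)^{\alpha}$ for the coefficient of $c_n^*$ rests on the growth of the partial sums $\sum_{k=2}^{n} k^{-1}(\log k)^{\alpha-1}$. The sketch below is a numerical illustration only, and the function names are hypothetical. It compares the partial sum with $(\log n)^{\alpha}$; by comparison with the integral of $x^{-1}(\log x)^{\alpha-1}$ the ratio tends to $1/\alpha$, so it stays above a positive constant, which is all that (12) uses.

```python
import math


def weight_sum(n: int, alpha: float) -> float:
    """sum_{k=2}^{n} k^{-1} (log k)^(alpha - 1)."""
    return sum(math.log(k) ** (alpha - 1.0) / k for k in range(2, n + 1))


def ratio(n: int, alpha: float) -> float:
    """weight_sum(n, alpha) / (log n)^alpha, which tends to 1/alpha as n grows."""
    return weight_sum(n, alpha) / math.log(n) ** alpha
```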


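From the second of them and from (6) we get


$\left( \sum_{n=2}^{\infty} n^{-1} c_n^{*\beta} \right)^{\alpha} \le \rho^{\alpha-1} B_0 = A \int_a^b |f| (\log^+|f|)^{\alpha}\,dx + A,$

which is the second part of Theorem 1.† To prove the first part, we notice
that the function $x^{-1}(\log x)^{\alpha-1}$ decreases for $x \ge n_0 \ge e^{\alpha-1}$, and so, from the
third inequality (12) and (6), we obtain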


n-(log { exp (— Bi = (Bo/p)’, mo, 


This gives statement (i) of Theorem 1, for some $k>0$. To prove it for every
$k>0$ it suffices to notice (rejecting a large number of terms from the convergent
series (10)) that $c_n^* = o((\log n)^{-\alpha})$, and to repeat the previous argument.

4. We now pass to Theorem 2. 


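LEMMA 4. Let $c_n>0$, $b_n \ge b_{n+1} > 0$, $\alpha>0$, and


(13) $\sum_{n=3}^{\infty} c_n \left(\log \frac{1}{c_n}\right)^{-\alpha} \le C_\alpha < \infty, \qquad \sum_{n=3}^{\infty} n^{-1} (\log n)^{\alpha-1} b_n \le B_\alpha < \infty.$


There exists a number $\sigma>0$ depending only on $\alpha$, and such that


(14) $S = \sum_{n=3}^{\infty} c_n b_n \le (\sigma C_\alpha + \tfrac12) B_\alpha.$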


n=3 
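From the second inequality it follows that $b_n \le B_\alpha \rho^{-1} (\log n)^{-\alpha}$. Break up
the sum $S$ into two, $S = S_1 + S_2$, where $S_1$ contains the indices $n$ for which
$b_n \le \sigma B_\alpha (\log (1/c_n))^{-\alpha}$, $\sigma$ being defined by the equality $(\rho\sigma)^{\beta} = 3$. It is obvious
that $S_1 \le \sigma C_\alpha B_\alpha$. If $n$ occurs in $S_2$, we have


$B_\alpha \rho^{-1} (\log n)^{-\alpha} \ge b_n \ge B_\alpha \sigma (\log 1/c_n)^{-\alpha},$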


and hence c, <n~*. Therefore 


> n~*(log n)—'b, S n— (log n)—1b, 
n=3 n=3 
Ba 


n=3 


1 
<— n— (log n)*—1b, = =" 


n=3 


† The condition $\alpha \le 1$ is essential. See V, Theorem 3.




and the lemma follows. 

Suppose now that in the series (2) we have not only $c_1 = 0$, but that also
a number of subsequent coefficients vanish, $c_2 = \dots = c_{n_0} = 0$, $n_0$ being so
large that


$C_0 = \sum_{n=n_0+1}^{\infty} |c_n| (\log 1/|c_n|)^{-\alpha} \le 1/(2\sigma).$
It follows that the coefficient of $B_\alpha$ in (14) does not exceed 1 if $C_\alpha$ is replaced
there by $C_0$. Let $s_N$, $N > n_0$, denote the partial sums of the series (2), and let
$g$ be any function with Fourier coefficients $b_n$ and $\Phi(|g|)$ integrable. Then


$\left| \int_a^b s_N g\,dx \right| = \left| \sum_{n=n_0+1}^{N} c_n b_n \right| \le \sum_{n=n_0+1}^{N} |c_n|\,|b_n|.$
On rearranging the terms in the last sum according to the decreasing magnitude
of $|b_n|$ and applying Lemma 4, and the inequality (10) (with a slightly
different notation where $c_n^*$, $f$ have been replaced by $b_n^*$, $g$), we get


(15) $\left| \int_a^b s_N g\,dx \right| \le \sum_{n=2}^{\infty} n^{-1} (\log n)^{\alpha-1} b_n^* \le \int_a^b \Phi(|g|)\,dx + D.$


On the other hand, if $g$ is chosen conveniently (see Lemma 1) the left-hand
member in (15) is equal to


$\int_a^b \Phi(|g|)\,dx + \int_a^b \Psi(|s_N|)\,dx,$


the last integral being finite. Comparing this with the right-hand side of (15),
we get


(16) $\int_a^b \Psi(|s_N|)\,dx \le D.$


By the theorem of Riesz-Fischer, the series (2) is the Fourier series of a function
$f \in L^2$ and a subsequence of $\{s_N\}$ converges almost everywhere to $f$. By
Fatou's well known lemma, the inequality (16) implies


$\int_a^b \Psi(|f|)\,dx \le D.$
It follows that $\exp(k|f|^{\beta})$ is integrable for some $k>0$. Rejecting the restriction
concerning the first coefficients of the series, we may assert the integrability
of $\exp(k|f - s_{n_0}|^{\beta})$, where $n_0$ is sufficiently large. Since $s_{n_0}$ is bounded,
$\exp(k|f|^{\beta})$ is again integrable for some $k>0$.




To prove that it is integrable for every $k>0$, it suffices to observe that for
any $\lambda>0$ and $c_n' = \lambda c_n$ the series $\sum |c_n'| (\log (1/|c_n'|))^{-\alpha}$ converges and so
$\exp(k\lambda^{\beta}|f|^{\beta})$ is integrable for every $\lambda>0$.

5. We now prove 


THEOREM 3. If $\sum n^{r-1} |c_n|^r$, $r>1$, converges, the series (2) is the Fourier
series of a function $f$ such that $\exp(k|f|^{r'})$ is integrable for all values of $k>0$.


We shall only sketch the proof, which is analogous to, and even a little
simpler than, that of Theorem 2. Using Hölder's inequality we see that
the series $\sum c_n b_n$ converges, even absolutely, for any $\{b_n\}$, such that
$\sum n^{-1} |b_n|^{r'} < \infty$. In particular, it converges if $b_n$ are the Fourier coefficients
of a function $g$ such that $|g| (\log^+|g|)^{1/r'}$ is integrable (see (ii) of Theorem
1). Since, roughly speaking, $\exp x^{r'}$ and $x (\log x)^{1/r'}$ are conjugate in the sense
of Young, the integrability of $\exp(k|f|^{r'})$ for some $k>0$, and hence for every
$k>0$, follows.
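
The application of Hölder's inequality can be made explicit by writing $|c_n b_n| = (n^{1/r'}|c_n|)(n^{-1/r'}|b_n|)$, which gives $\sum |c_n b_n| \le (\sum n^{r-1}|c_n|^r)^{1/r} (\sum n^{-1}|b_n|^{r'})^{1/r'}$. The sketch below checks this weighted form on finite sequences. It is an illustration under that reading of the hypothesis, the function name is hypothetical, and the sequences are indexed from $n = 2$ as in the series (2).

```python
def holder_sides(c, b, r):
    """Return (sum |c_n b_n|, weighted Hoelder bound) for sequences starting at n = 2."""
    r_conj = r / (r - 1.0)  # the conjugate exponent r'
    lhs = sum(abs(cn * bn) for cn, bn in zip(c, b))
    sum_c = sum(n ** (r - 1.0) * abs(cn) ** r for n, cn in enumerate(c, start=2))
    sum_b = sum(abs(bn) ** r_conj / n for n, bn in enumerate(b, start=2))
    return lhs, sum_c ** (1.0 / r) * sum_b ** (1.0 / r_conj)
```

For instance, `holder_sides([0.5, -0.25, 0.125], [1.0, 2.0, -3.0], 2.0)` returns a pair whose first entry does not exceed the second.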

Remark. In the case of trigonometric series and $r \ge 2$, Theorem 3 is a corollary
of Theorem 1, Note V (using the inequality (15) of that note).


VII. ON A THEOREM OF PALEY AND WIENER 


In a recent paper Paley and Wiener [17] proved the following theorem:
If $f(x)$ is defined over $(-\pi, \pi)$ as an odd function and is non-decreasing and
integrable over $(-\pi, \pi)$, then its conjugate function $\bar{f}(x)$ is also integrable.
Here we propose to give a simpler proof of this theorem, or rather of an
equivalent

THEOREM. If $f(x)$ is odd in $(-\pi, \pi)$, non-increasing and integrable over
$(0, \pi)$, then its conjugate function $\bar{f}(x)$ is also integrable.


The theorem is trivial if $f(x)$ is bounded on $(0, \pi)$. On the other hand there
is no loss of generality if we assume that $f(x)$ is not bounded only in the neighborhood
of $x = 0$ and that $f(x) \ge 0$, $0<x<\pi$. Our proof is based on the following
obvious

LEMMA. If $f(x)$ is integrable over $(0, 1)$ then the functions


= fe 


are also integrable. 


Now, assuming 0<x <7, we have 




—2z/2 —32/2 32/2 


= Ji(x) + Jo(x) + J3(x) + Ja(x). 
Here 


0 


7 
[va ax +0 


o( 
ot 


| J | 


| ax 
t w/2—t 


This proves that $\bar{f}(x)$ is integrable over $(0, \pi)$.


O 


BIBLIOGRAPHY 


1. T. J. Boks, Sur les rapports entre les méthodes d’intégration de Riemann et de Lebesgue, Rendi- 
conti del Circolo Matematico di Palermo, vol. 45 (1921), pp. 211-264. 

2. A. Denjoy, Sur l’intégration riemannienne, Comptes Rendus, vol. 169 (1919), pp. 219-221. 

3. L. Fejér, Ueber die arithmetischen Mittel erster Ordnung der Fourierreihe, Göttinger Nachrichten,
1925, pp. 13-17.

4. L. Fejér and F. Riesz, Ueber einige funktionentheoretische Ungleichungen, Mathematische
Zeitschrift, vol. 11 (1921), pp. 305-314.

5. G.H. Hardy, Remarks on three recent notes in the Journal, Journal of the London Mathematical 
Society, vol. 3 (1928), pp. 166-169. 

6. G. H. Hardy and J. E. Littlewood, Some new properties of Fourier constants, Mathematische 
Annalen, vol. 97 (1927), pp. 159-204. 




7. G. H. Hardy and J. E. Littlewood, Some properties of fractional integrals, Mathematische 
Zeitschrift, vol. 27 (1927-1928), pp. 565-606. 

8. G. H. Hardy and J. E. Littlewood, On absolute convergence of Fourier series, Journal of the 
London Mathematical Society, vol. 3 (1928), pp. 250-253. 

9. G. H. Hardy and J. E. Littlewood, Some new properties of Fourier constants, Journal of the 
London Mathematical Society, vol. 6 (1931), pp. 3-9. 

10. E. Hille, Note on the behavior of certain power series on the circle of convergence, Proceedings of 
the National Academy of Sciences, vol. 14 (1928), pp. 217-220. 

11. E. Hille, Note on a power series considered by Hardy and Littlewood, Journal of the London 
Mathematical Society, vol. 4 (1929), pp. 176-182. 

12. A. Kolmogoroff, Une série de Fourier-Lebesgue divergente presque partout, Fundamenta 
Mathematicae, vol. 4 (1923), pp. 324-328. 

13. A. Kolmogoroff, Sur les fonctions harmoniques conjuguées et sur les séries de Fourier, Funda- 
menta Mathematicae, vol. 7 (1925), pp. 23-28. 

14. A. Kolmogoroff, Sur un procédé d’intégration de M. Denjoy, Fundamenta Mathematicae, 
vol. 11 (1928), pp. 27-28. 

15. Marcinkiewicz, Sur la divergence des séries de Fourier, to appear in the Fundamenta Mathe- 
maticae. 

16. A. Oppenheim, Note on Mr. Cooper’s generalization of Young’s inequality, Journal of the 
London Mathematical Society, vol. 2 (1927), pp. 21-23. 

17. R. E. A. C. Paley and N. Wiener, Notes on the theory and application of Fourier transforms, 
Note II, these Transactions, vol. 35 (1933), pp. 354-355. 

18. R.E.A.C. Paley and A. Zygmund, On the partial sums of Fourier series, Studia Mathematica, 
vol. 2 (1930), pp. 221-227. 

19. B. N. Prasad, On the summability of Fourier series and the bounded variation of power series, 
Proceedings of the London Mathematical Society, (2), vol. 36 (1933), pp. 407-424. 

20. F. Riesz, Sur les polynômes trigonométriques, Comptes Rendus, vol. 158 (1914), pp. 1657-1661.

21. M. Riesz, Sur les maxima des formes bilinéaires et sur les fonctionnelles linéaires, Acta Mathe- 
matica, vol. 49 (1926), pp. 465-497. 

22. M. Riesz, Sur les fonctions conjuguées, Mathematische Zeitschrift, vol. 27 (1927-1928), 
pp. 218-244. 

23. O. Szász, Ueber den Konvergenzexponenten der Fourierschen Reihen gewisser Funktionenklassen,
Sitzungsberichte der Bayerischen Akademie der Wissenschaften, Mathematisch-Physikalische
Klasse, 1922, pp. 135-150.

24. G. Szegő, Ueber einen Satz des Herrn Serge Bernstein, Schriften der Königsberger gelehrten
Gesellschaft, Naturwissenschaftliche Klasse, vol. 5 (1928), pp. 59-70.

25. E. C. Titchmarsh, On conjugate functions, Proceedings of the London Mathematical Society, 
(2), vol. 29 (1929), pp. 49-80. 

26. Z. Waraszkiewicz, Remarque sur un théorème de M. Zygmund, Bulletin International de
l'Académie Polonaise, Classe des Sciences Mathématiques et Naturelles, (A), 1929, pp. 275-279.

27. H. Weyl, Bemerkungen zum Begriff der Differentialquotienten gebrochener Ordnung, Vierteljahrsschrift
der Naturforschenden Gesellschaft in Zürich, vol. 62 (1917), pp. 296-302.

28. A. Zygmund, Remarque sur la convergence absolue des séries de Fourier, Journal of the London 
Mathematical Society, vol. 3 (1928), pp. 194-196. 

29. A. Zygmund, Sur les fonctions conjuguées, Fundamenta Mathematicae, vol. 13 (1929), pp. 
284-303. 

30. A. Zygmund, On a theorem of Privaloff, Studia Mathematica, vol. 3 (1931), pp. 239-247. 


UNIVERSITY OF VILNA,
VILNA, POLAND


THE RIEMANN MULTIPLE-SPACE AND 
ALGEBROID FUNCTIONS* 


BY 
B. O. KOOPMAN† AND A. B. BROWN


1. Introduction. The present paper considers the extension of the Riemann
surface‡ to the case of several complex variables. The resulting configuration
will be called Riemann multiple-space§ (R. M. S.), and the first object is to
give its construction, or definition. It is then shown that the R. M. S. is a
generalized manifold.|| The property of being a generalized manifold is shown
to be a topologically invariant property of a complex, and a simple characterization
of a $GM_n$ is given. The locus of non-spherical points¶ of the R. M. S.
is proved to be a sub-complex of dimension not greater than $2n-4$,** where
$n$ is the number of independent variables; an example due to Osgood†† shows
that it can actually attain that dimensionality.

2. Properties of the generalized manifold. We prove certain properties 
which are needed in what follows. 


LEMMA 1. A generalized n-manifold is a simple n-circuit.‡‡


* Presented to the Society, October 28, 1933; received by the editors November 28, 1933. 


† Some of the results of the present paper were announced in preliminary form in the abstract
bearing the same title presented by B. O. Koopman (at that time National Research Council Fellow)
in the Bulletin of the American Mathematical Society, vol. 33 (1927), p. 406.

‡ For a treatment of the case of one independent variable, see H. Weyl, Die Idee der Riemannschen
Fläche, 2d edition, Leipzig, 1923.

§ Terms often used are Riemann hypersurface or Riemann space; but it seems undesirable to
use these, inasmuch as their use in the present connection involves a contradiction with other standard
mathematical usage.

|| O. Veblen, Analysis Situs, chapter III, pp. 95-96 in second edition; Colloquium Series, vol. 5,
part 2, New York, 1931. A generalized manifold of $n$ dimensions ($GM_n$) is defined as the set of points
on an n-circuit such that the cells of higher dimensions incident with any given i-cell have the incidence
relations of a $GM_{n-i-1}$. The only $GM_0$ is a pair of 0-cells.

Terminology will be as defined in Veblen, or as in Lefschetz's Topology, Colloquium Series, vol.
12, New York, 1930. (Lefschetz I.)

An n-circuit is an n-complex which (1) is the closure of its n-cells; (2) has an even number of
n-cells incident with each of its (n−1)-cells; (3) contains no proper sub-complex satisfying (1) and (2).

¶ A point of a k-complex will be called a spherical point if it has a neighborhood on the complex
which is homeomorphic to a k-cell.

** Note that this result does not of itself imply that the R. M. S. is a generalized manifold, nor
does the latter imply the former.

†† W. F. Osgood, Lehrbuch der Funktionentheorie, vol. 2, first part, chapter 2, §21. (Osgood II.)

‡‡ A simple n-circuit is an n-circuit each of whose (n−1)-cells is incident with exactly two of its
n-cells.




This lemma is stated for convenience in reference. It follows from the
facts that the n-manifold is an n-circuit and that a $GM_0$ is a pair of 0-cells.


LEMMA 2. A definition of $GM_n$, $n>0$, equivalent to the original one, is the
following. A $GM_n$ is a connected n-complex $K_n$ such that the cells of higher dimensions
incident with any given i-cell have the incidence relations of a $GM_{n-i-1}$.


Using induction, we assume the lemma proved for dimensions less than $n$.
As our proof will require no assumption for the case $n=1$, it remains only
to prove the lemma for general $n$ under the assumption of the induction.

The new definition differs from the original one only in the replacement
of “n-circuit” by “connected n-complex.” As any n-circuit is connected, we
need merely show that under the new definition a $GM_n$ is an n-circuit. As
the cells of higher dimensions incident with any i-cell have the incidence
relations of an (n−i−1)-complex, it follows that every point of $K_n$ is on
the closure of at least one n-cell. As the incidence relations between the cells
incident with any (n−1)-cell $E_{n-1}$ are those of a $GM_0$, it follows that $E_{n-1}$ is
incident with just two n-cells. Hence $K_n$ is an n-cycle each of whose (n−1)-cells
is incident with just two n-cells.

If $K_n$ were not an n-circuit there would be two sub-complexes $M_n^1$ and
$M_n^2$, each an n-cycle, containing all the n-cells of $K_n$ but having no common
n-cells. As $K_n$ is connected, $M_n^1$ and $M_n^2$ would have at least one common
cell, say an i-cell $E_i$. As the lemma is assumed true for dimensions less than $n$,
the cells $E^i$ of higher dimensions, of $K_n$, incident with $E_i$, would have the
incidence relations of an (n−i−1)-circuit $c_{n-i-1}$. Since $M_n^1$ and $M_n^2$ are n-cycles,
those of the cells of the set $E^i$ belonging to $M_n^j$ would have the incidence
relations of an (n−i−1)-cycle $c^j_{n-i-1}$, $j=1, 2$, which could be considered
as a sub-complex of $c_{n-i-1}$. But that is impossible, as an (n−i−1)-circuit
cannot have two sub-complexes each of which is an (n−i−1)-cycle, and distinct.
Hence $K_n$ must be an n-circuit, and the proof is complete.


LEMMA 3. A complex $K_n$ is a generalized manifold if and only if it is connected,
is the closure of its n-cells, and satisfies the following condition: If $K_n^*$ is
the set of spherical points of $K_n$, $K_n$ is locally connected* by curves which can be
taken in $K_n^*$ whenever their end points are in $K_n^*$.


To prove the necessity, let $P$ be any point of $K_n$, and $E_i$ the cell of $K_n$
on which it lies. Let $N$ be any neighborhood of $P$ on $K_n$, and $N' \subset N$ the
neighborhood consisting of all the cells of $K_n'$ on whose closures $P$ lies, where
$K_n'$ is the complex obtained by subdividing $K_n$ regularly enough times so


* Local connectedness in the ordinary sense is meant; in the terminology of Lefschetz I 
this means local 0-connectedness. 




that such an $N'$ exists. Let $P_1$ and $P_2$ be any two points in $N'$. Now we consider
$E_i$ and all the cells of higher dimensions that are incident with it. The
latter have the incidence relations of a generalized (n−i−1)-manifold. As this
manifold is, according to Lemma 1, a simple circuit, we can name a sequence
of (n−i−1)- and (n−i−2)-cells such that the corresponding sequence of
n- and (n−1)-cells of $K_n$ has the following properties: (1) each cell is incident
with the adjacent ones in the sequence; (2) $P_1$ is on the first cell or on its
boundary; (3) $P_2$ is on the last cell or on its boundary. The rest of the proof
is obvious.

To prove the sufficiency, suppose the condition satisfied and let $i$ be an
integer such that for every $E_r$, $r>i$, the incident cells $E^r$ of higher dimensions
have the incidence relations of a $GM_{n-r-1}$. We shall prove that the property
holds also when $r=i$. Let $E_i$ be any i-cell. The cells $E^i$ of higher dimensions
incident with $E_i$ have the incidence relations of a complex $k_{n-i-1}$, since $K_n$ is
the closure of its n-cells. Let $E_{i+j}$ be any (i+j)-cell incident with $E_i$. By
hypothesis, the cells $E^{i+j}$ incident with $E_{i+j}$ have the incidence relations of a
$GM_{n-i-j-1}$. Now in considering $k_{n-i-1}$, $E_{i+j}$ corresponds to a (j−1)-cell
$e_{j-1}$ of $k_{n-i-1}$. As the incidence relations of the cells of $E^{i+j}$ are the same
as the incidence relations of the corresponding cells of $k_{n-i-1}$ incident
with $e_{j-1}$, it follows that the latter relations are likewise those of a
$GM_{n-i-j-1}$. Thus $e_{j-1}$ satisfies the condition imposed on a (j−1)-cell and its
incident cells of higher dimensions on $k_{n-i-1}$ in order that $k_{n-i-1}$ should be a
$GM_{n-i-1}$. Since a similar statement can be made for any (i+j)-cell incident
with $E_i$, $j>0$, it follows from Lemma 2 that each connected part of $k_{n-i-1}$
is a $GM_{n-i-1}$.*

Now if $k_{n-i-1}$ were not connected, we could let $P^1$ and $P^2$ be points on
n-cells of $K_n$ corresponding to (n−i−1)-cells of two unconnected parts of
$k_{n-i-1}$, sufficiently near to some point $P$ of $E_i$ to satisfy the condition of
Lemma 3 for some neighborhood $N$ of $P$ containing no points on the boundary
of $E_i$. Then a curve $C$ would exist joining $P^1$ to $P^2$ on $K_n^*$ and in $N$. Then $C$
would contain a point $Q$ on $E_i$, as no cell of the group corresponding to the
first part of $k_{n-i-1}$ could be incident with any cell of the group corresponding
to the second part. Since $Q$ would have an n-cell neighborhood, by the invariance
of the combinatorial manifold† it follows that $k_{n-i-1}$ would be a
combinatorial (n−i−1)-sphere. As the latter is connected, we would then
have a contradiction to the hypothesis that $k_{n-i-1}$ is not connected. Consequently
it is connected, and therefore a $GM_{n-i-1}$.


* Cf. Veblen, loc. cit., pp. 96-97. 

† E. R. van Kampen. For references, see Lefschetz I. The linked complex of $E_i$ has the construction
of a regular subdivision of $k_{n-i-1}$. We have not used linked complexes in the proofs, as they
would have necessitated longer proofs.




It now follows by induction that for every r-cell $E_r$, $r = 0, 1, \dots, n-1$,
the incident cells of higher dimensions have the incidence relations of a
$GM_{n-r-1}$. Since $K_n$ is connected, it follows from Lemma 2 that $K_n$ is a $GM_n$,
and the proof of Lemma 3 is complete.

The following combinatorial characterization of the GM, is an easily 
proved consequence of Lemma 3. 


COROLLARY. A necessary and sufficient condition that a complex $K$ be a
$GM_n$ is that it have the following properties: firstly, it is connected; secondly,
every cell is an n-cell or on the boundary of an n-cell; thirdly, given any i-cell $E_i$,
$i<n$, and any two n-cells $E_n^1$ and $E_n^2$ incident with $E_i$, there exists a sequence
of cells of $K$ having the following properties: (1) $E_n^1$ is the first cell of the sequence
and $E_n^2$ is the last; (2) the cells of the sequence are alternately n-cells and (n−1)-cells;
(3) each cell of the sequence is incident with the adjacent cells of the sequence;
(4) all the cells of the sequence are incident with $E_i$.


This result might be described more briefly in the following terms. A 
generalized n-manifold is a connected n-complex which is locally a simple n- 
circuit.* 
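
The three properties in the Corollary are finite incidence conditions, so they can be tested mechanically. The sketch below is an illustration only, and the function names are hypothetical. It assumes a simplicial complex given by its n-simplices (a special case of the cell complexes considered here) and checks the conditions exactly as they are worded above. Because the complex is given by its top simplices, the second property holds by construction.

```python
from itertools import combinations


def _connected(nodes, linked):
    """True if the graph on `nodes` with adjacency test `linked` is connected."""
    nodes = list(nodes)
    if not nodes:
        return True
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        u = stack.pop()
        for v in nodes:
            if v not in seen and linked(u, v):
                seen.add(v)
                stack.append(v)
    return len(seen) == len(nodes)


def satisfies_corollary(top_simplices):
    """Check the first and third properties for a complex given by its n-simplices."""
    tops = [frozenset(s) for s in top_simplices]
    size = len(tops[0])  # n + 1 vertices per n-simplex
    if any(len(t) != size for t in tops):
        return False  # the input is assumed to list n-simplices only
    # First property: the complex is connected.
    if not _connected(tops, lambda u, v: bool(u & v)):
        return False
    # Third property: around every lower-dimensional simplex, any two incident
    # n-simplices are joined by a chain passing through shared (n-1)-faces
    # that contain that simplex.
    lower = {frozenset(c) for t in tops for k in range(1, size)
             for c in combinations(t, k)}
    for cell in lower:
        star = [t for t in tops if cell <= t]
        if not _connected(
            star, lambda u, v: len(u & v) == size - 1 and cell <= (u & v)
        ):
            return False
    return True
```

The boundary of a tetrahedron passes the test, while two triangles meeting in a single vertex fail it at that vertex, since no chain of triangles and edges through the vertex joins them.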

3. The Riemann multiple-space. Let a region $R$ be given in the $(2n+2)$-space
of the complex variables $w, z_1, \dots, z_n$, together with a function
$F(w, z_1, \dots, z_n) = F(w, z)$ with the following properties: (1) $F$ is single-valued
and analytic at all points of $R$; (2) $F=0$ for some points in $R$; (3) if we continue
analytically from a point $P$ at which $F=0$, over a path which may go
outside $R$, and return to the point $P$, then if the continued function vanishes
at $P$ it must be identical with the original function $F(w, z)$ at $P$; (4) $F$ is irreducible,
that is, it is not a product of two functions each satisfying the preceding
conditions and having the same locus of points when equated to zero.†

Given a point $P$ on the locus $F=0$, let $F$ be factored into a product of irreducible
analytic factors $F^i$, each vanishing at $P$. It will be proved below
that no two of the $F^i$ can be equivalent at $P$.‡

The Riemann multiple-space for the locus $F=0$ in $R$ is defined as the following
Hausdorff space. A point $P$ on $F=0$ together with one of the irreducible functions
$F^i$ at $P$, $(P, F^i)$, constitute a point of the space. If $F^i$ and $F^j$ are equivalent
at $P$, $(P, F^i)$ and $(P, F^j)$ are the same point of the R. M. S. A neighborhood of
$(P, F^i)$ consists of the set of points $(Q, F')$ for which $Q$ is in a neighborhood of $P$
on $F=0$ in $(w, z)$-space, and $F^i = F'\Phi$ at and near $Q$, where $\Phi$ is analytic at $Q$.


* Because of this fact, it may seem that we could have dispensed with the entire section on the 
GM,. However this is not the case, as the results are used in the later proofs. 

† In order to treat a function such as $w - \log z$, near a point at which $w - \log z = 0$ we use the
branch which vanishes at the point.

‡ $F^1$ and $F^2$ are equivalent at $P$ if $F^1 = F^2\,\Omega$ at and near $P$, where $\Omega$ is analytic and not zero at $P$.




From this definition it is evident that the topological properties of the
R. M. S. are independent of any change in coordinates. In order to carry
through our later proofs we make a change of coordinates if necessary, so
that for no $(z^0)$ is $F(w, z^0)$ zero for all $w$ neighboring any value determining a
point in $R$. That this can be done follows easily from a theorem of the authors
dealing with a somewhat similar situation in the case of reals.*

Given a point $(w^0, z^0)$, with $F(w^0, z^0) = 0$, we apply the Weierstrass Preparation
Theorem, giving us, near $(w^0, z^0)$,


(3.1) $F(w, z) = \left[\prod_i F_i(w, z)\right] \Omega(w, z).$


Here $\Omega$ is analytic and not zero at $(w^0, z^0)$, the product is finite, and $F_i$ is an
irreducible algebroid polynomial, in general not singular, with vertex at $(z^0)$.
Thus $F_i$ has the general form


$F_i(w, z) = w^r + \gamma_1(z) w^{r-1} + \dots + \gamma_r(z),$


where the $\gamma$'s are analytic at $(z^0)$, and all the roots of $F_i$ coincide in the value
$w^0$ when $(z) = (z^0)$.

From the properties of algebroid polynomials‡ it follows that these $F_i$'s
can be taken as those mentioned in the definition of R. M. S. No two of
them are equivalent, since in that case they would be identical and from the
hypotheses on $F$ and $R$ it would follow that $F(w, z)$ would be reducible, contrary
to hypothesis.

We observe that to each point of the locus $F=0$ correspond one or more
(but a finite number of) points of the R. M. S. The points of the R. M. S.
shall at times be considered in association with the corresponding points in
$(w, z)$-space on the locus $F=0$, and at other times, as is ordinarily the case
when $n=1$, in association with the corresponding points in $(z)$-space.

According to Theorem 6.II of KB, if any closed sub-set of $R$ is given, a
complex $K_{2n+2}$ can be found containing the sub-set, such that the locus $F=0$
in $K_{2n+2}$ is a sub-complex of even dimension, with analytic cells. In this case
the dimension is $2n$, and we denote the sub-complex by $K_{2n}$. We denote by
$K_{2n-2}$ the complex of all cells of $K_{2n}$ of dimensions less than $2n-1$.

* On the covering of analytic loci by complexes, these Transactions, vol. 34 (1932), pp. 231-251; 
Theorem 5.1. We shall refer to this paper as KB. On p. 233 of this paper the words “In irreducible-C 
factorization” should be inserted at the beginning of the last sentence in Theorem 2.V, and also in 
Corollary 2.VI. In the last line on p. 233 the words “at the same points as” should be replaced by 
“identically if and only if the same is true for.” 

See also S. Lefschetz and J. H. C. Whitehead, Analytical complexes, these Transactions, vol. 35 
(1933), pp. 510-517; §4. 


† Osgood II, chapter 2, §2.
‡ Osgood II, chapter 2, §§5, 7.




We shall now state and prove a simple set of rules for determining the 
R. M. S., and shall later use these rules in establishing certain properties of 
the R. M. S. 


LEMMA 4. The R. M. S. can be determined from $K_{2n}$ as the locus $L_{2n}$ which
we now describe. We keep all of the $2n$- and $(2n-1)$-cells of $K_{2n}$. At each point,
say $T$, of any cell $E_p$, $p < 2n-1$, we consider all the incident $(2n-1)$- and
$2n$-cells, and using them alone apply the test described in the concluding sentence of
Lemma 3 to the neighborhood of $T$, finding that the incident $2n$- and $(2n-1)$-cells
are thus grouped into a finite number of sets, for each of which the condition
of Lemma 3 is satisfied. For each of these sets we assign a point to $L_{2n}$,
corresponding to $T$. Then $L_{2n}$ consists of $(K_{2n} - K_{2n-2})$ and these new points,
with neighborhood on $L_{2n}$ determined as on $K_{2n}$ except at points corresponding to
points on $K_{2n-2}$, where it is determined in an obvious manner by use of the incident
$2n$- and $(2n-1)$-cells appearing in the tests mentioned above.


In applying this procedure at the boundary of $K_{2n+2}$, we must consider
$K_{2n}$ enlarged by the addition of part of the locus $F = 0$ outside of $K_{2n+2}$.

Before proving the lemma we observe that we shall show later that $L_{2n}$
is a complex and that, as we should expect from the above lemma, corresponding
to each cell of $K_{2n-2}$ is a finite positive number of cells of $L_{2n}$; but
corresponding to each cell of $K_{2n} - K_{2n-2}$ is just one cell of $L_{2n}$.

We begin by observing that each point of $K_{2n} - K_{2n-2}$ yields just one point
of the R. M. S.: the locus in (z)-space where values of $w$ coincide is defined
by equating discriminants to zero, hence is at most $(2n-2)$-dimensional. If
any point of a $2n$-cell or of a $(2n-1)$-cell projected onto that locus, every
point of the cell would project onto the locus, as all points of a cell have
similar neighborhoods, and the cells project in one-to-one manner onto cells of
the (z)-space. (The cells are obtained from cells in the (z)-space by two
successive steps of the kind described on page 249 of KB, where at each step
we obtain a cell of the first class.) Since the locus in question in (z)-space is
at most $(2n-2)$-dimensional, we would then have a contradiction to the
invariance of dimensionality.* Consequently, at each point of $K_{2n} - K_{2n-2}$, $w$
is a single-valued, and hence analytic, function of the $z$'s, and according to
the definition of R. M. S. each such point therefore yields just one point of
the R. M. S.

Now consider the points of the R. M. S. corresponding to a given point
$P$ of $K_{2n-2}$. For each of the irreducible functions $F^i$, vanishing at $P$, into
which $F$ factors, we obtain a point on the R. M. S. Let $P^0$ be the projection
of $P$ on (z)-space. Analytic continuation in a neighborhood of $P^0$, avoiding


* Brouwer. See Lefschetz I for references. 




points where the discriminant of $F$ vanishes, never leads from one function
$F^i$ to a distinct function $F^j$,* and furthermore such continuation can be made
a test for distinguishing the functions $F^i$. In so testing, the paths can be made
to avoid the projection, $K^0_{2n-2}$, on (z)-space of $K_{2n-2}$ without affecting the
results, since any path avoiding points where the discriminant of $F$ vanishes
can be deformed into a path of the kind wanted in such a way that none of the
intermediate positions of the path pass through any point of the part of
$K^0_{2n-2}$ for which the discriminant vanishes. This is because (z)-space is
$2n$-dimensional. Consequently, for each point of the R. M. S. corresponding to
$P$ the part of the R. M. S. corresponding to $(K_{2n} - K_{2n-2})$ hangs together near
the point in the way described in Lemma 3, and must therefore be one of the
sets designated in Lemma 4. This proves that the process of Lemma 4
determines all of the points of the R. M. S. corresponding to $P$, and each on the
boundary of the proper cells of $(K_{2n} - K_{2n-2})$.

Since each such set of cells of $(K_{2n} - K_{2n-2})$ must determine one of the
functions $F^i$, it follows that no unwanted points are determined by the process
of Lemma 4.

Consequently we have exactly the R. M. S. determined, and the proof
of Lemma 4 is complete.


LEMMA 5. The R. M. S. (locus $L_{2n}$) is a complex.


We begin with the cells of $K_{2n} - K_{2n-2}$, which can be taken as part of a
representation of $L_{2n}$, as we have already seen. Now consider points of $L_{2n}$
arising from points of $K_{2n-2}$, in the light of Lemma 4. All points of a given
cell of $K_{2n-2}$ have similar neighborhoods on $K_{2n} - K_{2n-2}$, in fact,
neighborhoods which are composed of parts of the same cells. From that fact and
Lemma 4 it follows that corresponding to each cell of $K_{2n-2}$ we have a finite
number of cells of points of $L_{2n}$, each incident with certain of the cells of
higher dimension of $K_{2n} - K_{2n-2}$. Now $L_{2n}$ is closed, as follows upon
consideration of Lemma 4, and of the fact that if a given cell of a complex is incident
with certain cells of higher dimensions, then any cell on its boundary is
incident with these cells of higher dimensions. Consequently $L_{2n}$ is a complex, as
we wished to prove.


THEOREM 1. The Riemann multiple-space ($L_{2n}$) is a set of generalized
manifolds (mod boundary of $K_{2n+2}$).


By this we mean that it is a complex consisting of a number of parts each 


* Osgood II, chapter 2, §§10, 11. We do not find there a general treatment of R. M. S., as the
points for which the discriminant vanishes are not treated.




of which satisfies the definition of generalized manifold except at the
boundary of $K_{2n+2}$.

According to Lemmas 5, 4 and 3, $L_{2n}$ satisfies the condition for a set of
generalized manifolds, except at the boundary of $K_{2n+2}$. Consequently
Theorem 1 is valid.

4. Non-spherical points. Any point of $L_{2n}$ which does not have a neighborhood
on the R. M. S. homeomorphic to a $2n$-cell shall be called a non-spherical
point. We shall prove that the non-spherical points form a sub-complex of
dimension not greater than $2n - 4$.


THEOREM 2. The R. M. S. $L_{2n}$ can be formed from $K_{2n}$ (locus $F = 0$), by
the process described by Veblen,* used in his proof that every n-circuit is a
singular generalized n-manifold.


With the $2n$- and $(2n-1)$-cells we get the correct result, since each
$(2n-1)$-cell not on the boundary of $K_{2n+2}$ is incident with just two $2n$-cells.
Now we use induction, supposing that we have got the correct result with
all cells down to those of dimension $p+1$, and next consider those of
dimension $p$. Under each of the two methods, that given by Veblen and that given
in Lemma 4 (under the test of Lemma 3), we replace a given $p$-cell $E_p$ of
$K_{2n-2}$ by a finite number of $p$-cells, each incident with certain groups of cells
of higher dimensions. In the first case, we have one $p$-cell for each group of
incident cells of higher dimensions which remain connected near $E_p$ when $E_p$
is removed, and in the second case we have a similar test, but consider only
the incident cells of dimensions $2n$ and $2n-1$. But from the corollary to
Lemma 3 we see that, since we know that we have a generalized manifold
insofar as cells of dimensions greater than $p$ are tested, we obtain the same
result by each of the two methods. Consequently Theorem 2 is valid.


THEOREM 3. The non-spherical points of the R. M. S. ($L_{2n}$) form a
sub-complex of dimension at most $2n - 4$.


As the set of spherical points is evidently an open set on $L_{2n}$, the set of
non-spherical points must be closed. As it must consist of a certain number of
cells, it is therefore a complex. It remains to prove that this complex is of
dimension at most $2n - 4$.

It is shown in KB† that near any point $P$ on the locus $F = 0$, above a point
where the discriminant is zero, but where the discriminant of the discriminant 
is not zero (upon second application of the Weierstrass Preparation Theo- 
rem), the locus of similarly described points near P is obtained by equating 


* Loc. cit. 
† §4, pp. 236-242.




$w$ and $z_n$, each to an analytic function of $(z_1, \ldots, z_{n-1})$. We denote by $J_{2n-2}$ a
$(2n-2)$-cell neighborhood of $P$ consisting of such similar points. Let $J^0_{2n-2}$
denote the projection on (z)-space of $J_{2n-2}$, with equation $z_n = \psi(z_1, \ldots, z_{n-1})$.
We now cover a neighborhood of the points of $J^0_{2n-2}$ in (z)-space by a set of
analytic cells of dimensions $2n-1$ and $2n$, as follows. Let $E_{2n-2}$ be a flat cell,
part of the locus $y_n = 0$ in the $2n$-space of the complex variables $(y_1, \ldots, y_n)$.
Cover a neighborhood of $E_{2n-2}$ in (y)-space by $E_{2n-2}$ and a set of flat $(2n-1)$-cells
and $2n$-cells, each incident with $E_{2n-2}$, alternating in order, arranged in
cyclic order. Next make the transformation (with non-vanishing Jacobian)
$y_i = z_i$, $i = 1, \ldots, n-1$, and $y_n = z_n - \psi(z_1, \ldots, z_{n-1})$. This transformation
gives us the set of cells covering a neighborhood of $J^0_{2n-2}$, that we wanted.
Above any of the $2n$- or $(2n-1)$-cells of this neighborhood, near $J^0_{2n-2}$, $w$
equals a finite number of distinct-valued analytic functions of $(z) = (z_1, \ldots, z_n)$,
since these cells contain no points for which the discriminant vanishes.
Corresponding to a circuit of the $2n$- and $(2n-1)$-cells incident with $E_{2n-2}$
in (y)-space we will now have a circuit around $J^0_{2n-2}$ (we can consider a curve
going around it), and if we go around enough times (a finite number) we must
come back to the original value of $w$, hence back to the original point of the
R. M. S. at which we started the curve. Hence the point $P$ has a neighborhood
on $L_{2n}$ consisting of $J_{2n-2}$ and a set of incident $2n$- and $(2n-1)$-cells (not cells
of $L_{2n}$) arranged in cyclic order, and alternating. Of course, $P$ might be on a
$(2n-3)$-cell of $L_{2n}$, or even on one of lower dimension, but that does not affect
our work. Consequently $P$ is a spherical point.
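The statement that a finite number of circuits returns $w$ to its original value can be checked numerically in the simplest case. The sketch below is an illustration only: it assumes $n = 1$ and the hypothetical locus $w^k = z$, whose discriminant vanishes only at $z = 0$, and it follows one root by continuity around the circle $|z| = \text{radius}$. The function names are invented for this example.

```python
import cmath
import math


def kth_roots(z, k):
    """Return all k complex roots of w**k = z (z must be non-zero)."""
    r = abs(z) ** (1.0 / k)
    theta = cmath.phase(z)
    return [cmath.rect(r, (theta + 2 * math.pi * j) / k) for j in range(k)]


def continue_once(w, k, radius=0.5, steps=720):
    """Continue the root w of w**k = z once around the circle |z| = radius.

    At every small step the new value is the root closest to the previous
    one, which is a discrete stand-in for analytic continuation.
    """
    for s in range(1, steps + 1):
        z = cmath.rect(radius, 2 * math.pi * s / steps)
        w = min(kth_roots(z, k), key=lambda c: abs(c - w))
    return w


def circuits_to_return(k, radius=0.5, max_turns=50):
    """Count the circuits around z = 0 needed before the root returns."""
    start = kth_roots(complex(radius, 0.0), k)[0]
    w = start
    for turn in range(1, max_turns + 1):
        w = continue_once(w, k, radius)
        if abs(w - start) < 1e-9:
            return turn
    return None


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, circuits_to_return(k))
```

For this model locus the root returns after exactly $k$ circuits, so the sheets over a punctured neighborhood of $z = 0$ join cyclically, which is the picture used in the proof above.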

Therefore the non-spherical points of $L_{2n}$ must project onto points of
(z)-space for which the discriminant of the discriminant is zero. The locus of such
points is at most $(2n-4)$-dimensional, and hence the locus of non-spherical
points cannot contain any cell of dimension higher than $2n-4$. For such a
cell would project onto (z)-space in a cell of the same dimension $\geq 2n-3$,
which would contradict the result just obtained. Thus the proof is complete.


COLUMBIA UNIVERSITY, 
New York, N. Y. 


ON A CERTAIN CORRESPONDENCE BETWEEN 
SURFACES IN HYPERSPACE* 


BY 
V. G. GROVE 


1. INTRODUCTION 


Consider a surface $S$ and a point $x$ on $S$. Let the parametric vector
equation of $S$ be


(1)    $x = x(u, v).$


The ambient space of the osculating planes at the point $x$ to all of the curves
through $x$ is a certain space $S(2, 0)$ called the two-osculating space of $S$ at $x$.
This space is determined by the six points


(2)    $x,\; x_u,\; x_v,\; x_{uu},\; x_{uv},\; x_{vv}.$


It is the purpose of this paper to find all surfaces $\bar{S}$ in one-to-one point
correspondence with $S$, such that the two-osculating space $\bar{S}(2, 0)$ of $\bar{S}$
coincides with the two-osculating space $S(2, 0)$ of $S$ at corresponding points. We
shall find that the surface $S$ is not arbitrary, but that the functions $x$ satisfy
certain third-order partial differential equations studied by Lane† and by
Bompiani.‡ A similar statement holds for the surface $\bar{S}$.

Let the surfaces $S$ and $\bar{S}$ be in one-to-one point correspondence so that
the corresponding points have the same curvilinear coordinates.

In order that $\bar{S}(2, 0)$ at $\bar{x}$ coincide with $S(2, 0)$ at $x$, it is necessary and
sufficient that the functions


(3)    $\bar{x},\; \bar{x}_u,\; \bar{x}_v,\; \bar{x}_{uu},\; \bar{x}_{uv},\; \bar{x}_{vv}$


be expressible as linear, homogeneous functions of the functions (2). The
parametric vector equation of $\bar{S}$ will therefore be of the form


(4)    $\bar{x} = \bar{x}(u, v) = A x_{uu} + B x_{uv} + C x_{vv} + \alpha x_u + \beta x_v + \gamma x.$


We shall call the case in which S(2, 0) is a space of five dimensions and in 
which the coefficients A, B, C of (4) satisfy the inequality 


* Presented to the Society, April 7, 1934; received by the editors February 20, 1934. 

† E. P. Lane, Integral surfaces of pairs of partial differential equations of the third order, these
Transactions, vol. 32 (1930), pp. 782-793. Hereafter referred to as Lane, Surfaces.

‡ E. Bompiani, Determinazione delle superficie integrali d'un sistema di equazioni a derivate
parziali lineari ed omogenee, Rendiconti del Reale Istituto Lombardo di Scienze e Lettere, vol. 52 (1919),
pp. 820-830. Hereafter referred to as Bompiani, Surfaces.




(5)    $B^2 - 4AC \neq 0$
the non-parabolic case, and the case in which $S(2, 0)$ is a space of five
dimensions and in which
(6)    $B^2 - 4AC = 0$
the parabolic case. By proper choice of $\varphi$, $\psi$, and $\lambda$ in the transformation
a d= £= df’, 
in the non-parabolic case, we may write (4) in the form
(7)    $\bar{x} = x_{uv} + \alpha x_u + \beta x_v + \gamma x;$
and in the parabolic case in the form
(8)    $\bar{x} = x_{uu} + \alpha x_u + \beta x_v + \gamma x.$

We shall denote by $S(3, 0)$ the ambient space of the three-dimensional
spaces osculating all of the curves on $S$ through $x$. The space $S(3, 0)$ is
determined by the six points (2) and the points

(9)    $x_{uuu},\; x_{uuv},\; x_{uvv},\; x_{vvv}.$

2. THE NON-PARABOLIC CASE 

If we differentiate $\bar{x}$ defined by (7) with respect to $u$ and $v$ we obtain the
following expressions:

(10)    $\bar{x}_u = x_{uuv} + \alpha x_{uu} + \beta x_{uv} + (\alpha_u + \gamma) x_u + \beta_u x_v + \gamma_u x,$
        $\bar{x}_v = x_{uvv} + \alpha x_{uv} + \beta x_{vv} + \alpha_v x_u + (\beta_v + \gamma) x_v + \gamma_v x.$

The points $\bar{x}_u$, $\bar{x}_v$ are in the space $S(2, 0)$ if, and only if, the functions $x$
defining the surface $S$ satisfy a system of differential equations of the form

(11)    $x_{uuv} = a x_{uu} + h x_{uv} + b x_{vv} + l x_u + m x_v + d x,$
        $x_{uvv} = a' x_{uu} + h' x_{uv} + b' x_{vv} + l' x_u + m' x_v + d' x.$

It follows therefore that in the non-parabolic case $S(3, 0)$ is of dimensions no
higher than seven.

Subcase a. Suppose that $S(3, 0)$ is a space of seven dimensions. It follows
that the functions $x$ satisfy the equations (11) and no other third-order
differential equations. Under these conditions some of the integrability conditions*
of system (11) are

a' = b = 0, ah' = 0.

Equations (10) may be written in the form


* Bompiani, Surfaces, p. 632. 




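(12)    $\bar{x}_u = (a + \alpha) x_{uu} + (h + \beta) x_{uv} + (l + \alpha_u + \gamma) x_u + (m + \beta_u) x_v + (d + \gamma_u) x,$
        $\bar{x}_v = (h' + \alpha) x_{uv} + (b' + \beta) x_{vv} + (l' + \alpha_v) x_u + (m' + \beta_v + \gamma) x_v + (d' + \gamma_v) x.$
From (12) we see that the points $\bar{x}_{uu}$, $\bar{x}_{uv}$, $\bar{x}_{vv}$ lie in $S(2, 0)$ if, and only if,
(13)    $a + \alpha = 0, \qquad b' + \beta = 0.$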

Therefore the point $\bar{x}$ defined by the expression

(14)    $\bar{x} = x_{uv} - a x_u - b' x_v + \gamma x$


generates a surface $\bar{S}$ whose two-osculating space $\bar{S}(2, 0)$ at $\bar{x}$ coincides with the
two-osculating space $S(2, 0)$ at $x$.

From (12) and (14) we find that the expressions for $\bar{x}_u$ and $\bar{x}_v$ may be
written in the form
(15)    $\bar{x}_u = [a(h - b') + l - a_u + \gamma]\, x_u + [d + \gamma_u - \gamma(h - b')]\, x + (h - b')\, \bar{x},$
        $\bar{x}_v = [b'(h' - a) + m' - b'_v + \gamma]\, x_v + [d' + \gamma_v - \gamma(h' - a)]\, x + (h' - a)\, \bar{x}.$
Therefore the lines $g$ joining corresponding points $x$ and $\bar{x}$ of $S$ and $\bar{S}$ form a
congruence G, and the surfaces $S$ and $\bar{S}$ sustain C nets* in relation C; the
developables of G intersect $S$ and $\bar{S}$ in these C nets. Conversely if two nets are in
relation C their sustaining surfaces have coincident two-osculating spaces at
corresponding points.
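The step from (12) to (15) is a direct substitution, and the following worked sketch spells it out for the first line. It assumes that (14) reads $\bar{x} = x_{uv} - a x_u - b' x_v + \gamma x$ and that (13) holds, so that $\alpha = -a$ and $\beta = -b'$. The vanishing of the $x_v$ coefficient in the last step is taken here as one of the integrability conditions of (11); that is an assumption of this sketch and is not derived in it.

```latex
% Solve (14) for x_{uv}:
x_{uv} = \bar{x} + a\,x_u + b'\,x_v - \gamma\,x .
% With \alpha = -a and \beta = -b', the first line of (12) becomes
\bar{x}_u = (h - b')\,x_{uv} + (l - a_u + \gamma)\,x_u
          + (m - b'_u)\,x_v + (d + \gamma_u)\,x .
% Substituting the expression for x_{uv}:
\bar{x}_u = (h - b')\,\bar{x}
          + \bigl[\,a(h - b') + l - a_u + \gamma\,\bigr]\,x_u
          + \bigl[\,b'(h - b') + m - b'_u\,\bigr]\,x_v
          + \bigl[\,d + \gamma_u - \gamma(h - b')\,\bigr]\,x .
```

When $b'(h - b') + m - b'_u = 0$, the point $\bar{x}_u$ lies in the plane of $x$, $x_u$ and $\bar{x}$, so the lines $g$ along a curve $v$ = const. form a developable, in agreement with the conclusion about the congruence G.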

Subcase b. Suppose that S(3, 0) is of six dimensions. By proper choice 
of the notation, the functions x satisfy a system of differential equations of 


the form 
Luuv = + + + leu + mx, + dx, 


(16) = + + + + m'x, + d'x 
but no other third-order differential equations. 
From (7) we find that 

Bu = (a+ @) tun + (h + B)Xuv + bX + + tu + Xu + (m + Bu)xe 
(17) + (d + yu)z, 
= + (h’ + a) + (B + + + tu + (m' + Br + y) 

+ (d’ + yo)x. 
It follows from (17) and (16) that the points $\bar{x}_{uu}$, $\bar{x}_{uv}$, $\bar{x}_{vv}$ lie in $S(2, 0)$ if, and
only if,


(18) B+b'=0, A(at+a) =0. 


* V. G. Grove, The transformation C of nets in hyperspace, these Transactions, vol. 33 (1931), 
pp. 733-741. 




If we use (18) we may write equations (17) in the form 
By = [+ —a(h— db’) [m—b/ — 0’) 
+ [d+ — v(h — b’) + (h — 
= t+ + a, — + a) + — +a) 
+ [d’ + — y(h’ + a) + + a) 2. 
Some of the integrability conditions of system (16) with )=0 are 
Ad’ =0, @+ah+a,=a'a"+ ah’ +a/ 
b'(h — b') + m— bi 


(19) 


(20) 


A. Suppose first that $A \neq 0$, $a' = 0$. Under conditions (18) equations (19)
may be written in the form 


= [l—aytyta(h— b’)|xu + [d+ — — 0’) |x + (h — 

= [m’— bf +y4+0'(h' — [d’ +7. — — a) + 
It follows therefore that if $A \neq 0$, $a' = 0$, the surfaces $S$ and $\bar{S}$ sustain C nets,
and the lines $g$ joining corresponding points $x$ and $\bar{x}$ form a congruence G, the
developables of G intersecting these surfaces in their C nets.


B. Suppose that A =0. Under this condition another integrability condi- 
tion of system (16) is b’’ =0. Equations (19) may now be written in the form 


fu = (a+ + + — a(h — + [2d + vu — — 
+ (h — 

= + + ay — ah’ + a) + — +7 4+ +a) 
+ [d’ — + a) + (h' + 


(22) 


It follows that the tangent to v=const. on S intersects the osculating plane to 
v=const. on S. The tangent planes to $S$ and $\bar{S}$ at $x$ and $\bar{x}$ respectively intersect
in a point; they will intersect in a line if, and only if, $a' = a + \alpha = 0$, that is,
if, and only if, the parametric nets on $S$ and $\bar{S}$ are in relation C. In this latter
case the lines joining corresponding points $x$ and $\bar{x}$ form a congruence.


3. THE PARABOLIC CASE 


Let us consider the parabolic case. If we differentiate $\bar{x}$ defined by (8) with
respect to $u$ and $v$, we obtain


(23)    $\bar{x}_u = x_{uuu} + \alpha x_{uu} + \beta x_{uv} + (\alpha_u + \gamma) x_u + \beta_u x_v + \gamma_u x,$
        $\bar{x}_v = x_{uuv} + \alpha x_{uv} + \beta x_{vv} + \alpha_v x_u + (\beta_v + \gamma) x_v + \gamma_v x.$


It follows therefore that if the points $\bar{x}_u$, $\bar{x}_v$ lie in $S(2, 0)$ the functions $x$ must
satisfy a system of differential equations of the form


(24)    $x_{uuu} = a x_{uu} + h x_{uv} + b x_{vv} + l x_u + m x_v + d x,$
        $x_{uuv} = a' x_{uu} + h' x_{uv} + b' x_{vv} + l' x_u + m' x_v + d' x.$

It follows therefore that $S(3, 0)$ is of dimensions no higher than seven.
Subcase a. Suppose that $S(3, 0)$ is of seven dimensions, that is, that the
functions $x$ do not satisfy a third third-order differential equation.
The system (24) has the following integrability conditions*:
b=0,h=0',a,=a/ +l,
ho tah +l=hi +ah+h? +m’,
(25) ay’ + +al+hl +70,
m,+am'+d=mi +a'm+h'm',
d,+ad’=d{/ +ad+dh’.
It follows from (23) and (24) that the functions $\bar{x}_u$ and $\bar{x}_v$ are defined by
the expressions
(26)    $\bar{x}_u = (a + \alpha) x_{uu} + (h + \beta) x_{uv} + (l + \alpha_u + \gamma) x_u + (m + \beta_u) x_v + (d + \gamma_u) x,$
        $\bar{x}_v = a' x_{uu} + (h' + \alpha) x_{uv} + (b' + \beta) x_{vv} + (l' + \alpha_v) x_u + (m' + \beta_v + \gamma) x_v + (d' + \gamma_v) x.$
From (26) we find that the points $\bar{x}_{uu}$, $\bar{x}_{uv}$, $\bar{x}_{vv}$ lie in the space $S(2, 0)$ if and
only if
(27)    $\alpha + h' = 0, \qquad \beta + b' = 0.$


Therefore the surface $\bar{S}$ generated by the point $\bar{x}$ defined by the expression


(28)    $\bar{x} = x_{uu} - h' x_u - b' x_v + \gamma x$


is such that the two-osculating space $\bar{S}(2, 0)$ at $\bar{x}$ coincides with the space $S(2, 0)$
at $x$ for every choice of $\gamma$.
If we make use of equation (28) we may write equation (26) in the form 
fu = ux, + fx + AZ, 
Ey = + wx, + gx + Bi, 


(29) 


wherein 


w= Wa-h')+l—hi +7 ='b' + m' — +7, 
(30) 
g=d+y—-ay, 


* Lane, Surfaces, p. 792. 




We may readily verify that as $x$ ($\bar{x}$) moves along the curve $v$ = const. on
$S$ ($\bar{S}$) the point


$y = \bar{x} - \rho x, \qquad r \neq 0,$


describes a curve whose tangent at $y$ is the line $g$ joining $x$ to $\bar{x}$. Moreover
there exists no other curve on $S$ ($\bar{S}$) along which $x$ ($\bar{x}$) may move so that the
line $g$ will generate a developable surface. We may readily verify that the
lines $g$ generate a congruence G composed of the tangents to a one-parameter
family of asymptotic curves on the surface generated by the point $y$. However the
point $y$ defined by the expression


$y = \bar{x} - \rho x, \qquad r = 0,$
is a fixed point, and the lines $g$ form a bundle of lines through this fixed point.
Subcase b. Suppose that the space S(3, 0) is of six dimensions. 


A. The points xu», %»», as may be seen from (23), will lie in the space 
S(2, 0) if 


(31) B=—h, 
and if x satisfies the equations (24) and a differential equation of the form 
(32) = tun + + + + + 


Some of the integrability conditions of the system composed of equations (24) 
and (32) are 


=0. 
We may readily verify that the point % defined by 
(33) E = — — + yx 


generates a surface $\bar{S}$ whose two-osculating space $\bar{S}(2, 0)$ at $\bar{x}$ coincides with the
two-osculating space $S(2, 0)$ of $S$ at $x$. Moreover the tangent planes to $S$ at $x$
and $\bar{S}$ at $\bar{x}$ intersect in a line $h$. The projectivity determined on $h$ by the pencils of
tangent lines to $S$ and $\bar{S}$ at $x$ and $\bar{x}$ is parabolic. The lines $g$ joining $x$ to $\bar{x}$ form
a congruence of tangents to a one-parameter family of asymptotic curves on a
surface.

B. The space $\bar{S}(2, 0)$ of $\bar{S}$ at $\bar{x}$ will also coincide with the space $S(2, 0)$
at $x$ if


=0, 


and if x satisfies equations (24) and a differential equation of the form 




Two of the integrability conditions of such a system are 


b=0, =0. 


It follows therefore that any point defined by the expression


$\bar{x} = x_{uu} + \alpha x_u + \gamma x$


($\alpha$ and $\gamma$ arbitrary) in the osculating plane to $v$ = const. on $S$ at $x$ generates a
surface $\bar{S}$ whose two-osculating space $\bar{S}(2, 0)$ at $\bar{x}$ coincides with the space $S(2, 0)$
at $x$. The tangent planes to $S$ and $\bar{S}$ at $x$ and $\bar{x}$ intersect in a point.

Suppose that in the expression (4) A=B=C=0. By a transformation of 
the curvilinear coordinates we may write (4) in the form 


(35) + yx. 

By repeated differentiations we find that $\bar{S}(2, 0)$ coincides with $S(2, 0)$ if,
and only if, the functions $x$ satisfy a system of differential equations composed
of equations of the form (24) and (34). It follows that the space $S(3, 0)$ of $S$
at $x$ is of six dimensions. Conversely if the functions satisfy such a system, a
point $\bar{x}$ defined by (35) generates a surface of the required type.


4. THE CONJUGATE CASE 


Suppose now that $S$ sustains a conjugate net. By proper choice of the
parameters we may take this net to be the parametric net. The functions $x$
therefore satisfy an equation of the Laplace type


(36)    $x_{uv} = a x_u + b x_v + c x.$


It follows from (36) that $S(2, 0)$ is a space of four dimensions and that $S(3, 0)$
is a space of not more than six dimensions.
Let the point $\bar{x}$ be defined by the expression


(37)    $\bar{x} = A x_{uu} + C x_{vv} + \alpha x_u + \beta x_v + \gamma x,$


wherein not both A and C are zero.
A. Suppose first that $S(3, 0)$ is of six dimensions. We find readily that
there exist no surfaces $\bar{S}$ distinct from $S$ such that the spaces $\bar{S}(2, 0)$ and
$S(2, 0)$ coincide.
B. Suppose that S(3, 0) is of five dimensions. We find from (37) that 
3 (38) iy, = Axuuu + (A + Xun + (6C + Cu)Xvv + [C(a, + a’) + a8 +au+ 
+ [Cle + ab + + 08 + + [C(co + ac) + Be + yulx. 


A symmetrical expression obtains for $\bar{x}_v$. It follows that if $A \neq 0$, the
functions $x$ satisfy an equation of the form
(39) XLuuu = + + + m'x, + 


In order that $\bar{x}_v$ lie in the space $S(2, 0)$, and that $S(3, 0)$ be a space of five
dimensions the coefficient C must be zero.


Some of the integrability conditions of the system composed of equations 
(36) and (39) are 


=0, m’=0, 


Hence the curves $v$ = const. on $S$ are plane curves. With the expression for
$\bar{x}_v$ and $C = 0$, we find that $\bar{x}_{vv}$ lies in $S(2, 0)$ if, and only if, $\beta = 0$. Hence $\bar{x}$ lies
in the plane of the curve $v$ = const.

If we set $A = 1$, we find that the points $\bar{x}_u$, $\bar{x}_v$ are defined by the expressions


fu = +y+au—a(a’ +a) 
(40) + [d’+yu—v(a’ +a) ]x+ + a) 2, 
Ey = (¢ + ab + au + + (6? + bu + ab + 
+ (Cu + be + ac + yo — ay)x + ak. 


The tangent planes to $S$ and $\bar{S}$ at $x$ and $\bar{x}$ intersect in a line. Hence if $S(3, 0)$
is a space of five dimensions, and if $S$ sustains a conjugate net, the point $\bar{x}$
defined by (37) will describe a surface $\bar{S}$ whose two-osculating space $\bar{S}(2, 0)$ at $\bar{x}$
coincides with $S(2, 0)$ at $x$ if and only if each curve of one of the component
families of curves of the conjugate net is a plane curve, and the point $\bar{x}$ is a point in
the plane of the curve. The lines $g$ joining $x$ and $\bar{x}$ form a congruence.

Suppose that $\bar{x}$ lies in the tangent plane of $S$ at $x$, that is, suppose that in
(37) A = C = 0. We readily verify that if $S(3, 0)$ is of six dimensions the space
$\bar{S}(2, 0)$ at $\bar{x}$ cannot coincide with the space $S(2, 0)$ at $x$ for distinct surfaces $S$
and $\bar{S}$. If $S(3, 0)$ is a space of five dimensions, the point $\bar{x}$ must lie in the
tangent to one of the curves of the conjugate net, and that family of curves
is a family of plane curves.


5. THE ASYMPTOTIC CASE 


Suppose that $S$ sustains a one-parameter family of asymptotic curves.
Let the notation be so chosen that the curves $v$ = const. are the asymptotics.
It follows that the functions $x$ defining $S$ satisfy the differential equation


(41)    $x_{uu} = a x_u + b x_v + c x.$


It follows that the space $S(3, 0)$ is a space of six dimensions at most.




Let $\bar{x}$ be defined by an expression of the form
(42)    $\bar{x} = B x_{uv} + C x_{vv} + \alpha x_u + \beta x_v + \gamma x,$


wherein not both B and C are zero.

A. We may readily verify that if $S(3, 0)$ is a space of six dimensions, there
exists no surface $\bar{S}$ distinct from $S$ with the desired property.

B. Suppose therefore that $S(3, 0)$ is a space of five dimensions. It follows
from (42) that the points $\bar{x}_u$ and $\bar{x}_v$ are in $S(2, 0)$ if, and only if, $C = 0$, and
the functions $x$ satisfy a differential equation of the form
(43)    $x_{uvv} = a' x_{uv} + b' x_{vv} + l' x_u + m' x_v + d' x.$

Two of the integrability conditions of the system composed of equations (41) 
and (43) are 

(44) b=0, c—bdbi + = 0. 

It follows therefore that the surface S is ruled. 

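If in (42) we set $C = 0$, $B = 1$, we find that

(45)    $\bar{x}_u = (a + \beta) x_{uv} + (a_v + a\alpha + \alpha_u + \gamma) x_u + (c + \beta_u) x_v + (c_v + c\alpha + \gamma_u) x,$
        $\bar{x}_v = (a' + \alpha) x_{uv} + (\beta + b') x_{vv} + (l' + \alpha_v) x_u + (m' + \beta_v + \gamma) x_v + (d' + \gamma_v) x.$

The points $\bar{x}_{uu}$, $\bar{x}_{uv}$, $\bar{x}_{vv}$ lie in $S(2, 0)$ if, and only if, $\beta = -b'$. Equation (45)
may be written in the form

(46)    $\bar{x}_u = [a_v + a\alpha + \alpha_u + \gamma - \alpha(a - b')]\, x_u + [c_v + \alpha c + \gamma_u - \gamma(a - b')]\, x + (a - b')\, \bar{x},$
        $\bar{x}_v = [l' + \alpha_v - \alpha(a' + \alpha)]\, x_u + [m' - b'_v + \gamma + b'(a' + \alpha)]\, x_v + [d' + \gamma_v - \gamma(a' + \alpha)]\, x + (a' + \alpha)\, \bar{x}.$
The point $\bar{x}$ defined by the expression


$\bar{x} = x_{uv} + \alpha x_u - b' x_v + \gamma x$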


for arbitrary values of $\alpha$ and $\gamma$ generates a surface $\bar{S}$ whose two-osculating space
$\bar{S}(2, 0)$ at $\bar{x}$ coincides with the two-osculating space $S(2, 0)$ of $S$ at $x$.

The point $r$ defined by the expression $r = x_v - b' x$ is readily characterized
as the only point, on the generator through $x$ of the ruled surface, describing
a surface for which the osculating plane to the curve $u$ = const. at $r$ lies in
the space of three dimensions tangent to the ruled surface along the generator
through $x$. We find that




tar — (ab’ + 


It follows that the lines $g$ joining $x$ to $\bar{x}$ form a congruence. The line $g$ passes
through $x$ and intersects the tangent line to the curve $u$ = const. on the surface
generated by the point $r$.

Suppose that $\bar{x}$ lies in the tangent plane to $S$ at $x$. We readily verify that
$\bar{S}(2, 0)$ at $\bar{x}$ will coincide with $S(2, 0)$ at $x$ if and only if $\bar{x}$ lies in the tangent
line of the asymptotic curve on $S$ through $x$, and if the functions $x$ defining
the surface satisfy a differential equation of the form (43).


MICHIGAN STATE COLLEGE, 
EAST LANSING, MICH.




THE SOLUTIONS OF THE MATHIEU EQUATION WITH 
A COMPLEX VARIABLE AND AT LEAST 
ONE PARAMETER LARGE* 


BY 
RUDOLPH E. LANGER 


Introduction. The Mathieu differential equation


(1)    $\dfrac{d^2 u}{dz^2} + \{\Lambda - \Omega \cos 2z\}\, u = 0,$


also commonly known as the equation of the elliptic cylinder functions, is 
too well known to require any introduction. Its solutions govern problems 
of the greatest diversity in astronomy and theoretical physics, and have 
accordingly been the subjects of a vast number of investigations. 

The differential equation as such depends upon two independent parameters,
designated in the form written above by $\Lambda$ and $\Omega$. In the present
discussion these are to be taken real but are to be numerically unrestricted
except that at least one is to be large. The variable will be permitted to range
over the complex plane.

Since the coefficient of the differential equation is an even simply periodic
analytic function of $z$, it is known from Floquet's theory of such equations
that the solutions are in general of the structure


$u(z) = c_1 e^{\mu z} \varphi(z) + c_2 e^{-\mu z} \varphi(-z),$


in which the function $\varphi(z)$ is periodic. The characteristic exponent, $\mu$, is a
constant as to $z$ but depends in an intricate way upon the parameters $\Lambda$
and $\Omega$. If it is real, the equation obviously possesses a solution which for
large real values of the variable becomes exponentially infinite, i.e., a so
called unstable solution. In the alternative case the exponent is pure
imaginary and the solutions remain bounded along the axis of reals, i.e., are of the
so called stable type. The intermediate case in which $\mu = 0$ is of especial
importance, for the equation then admits one solution known as a Mathieu
function which is periodic. The second solution, a Mathieu function of the
second kind, is then not periodic and is of a functional structure distinct from
that indicated above.

* Presented to the Society, April 6, 1934; received by the editors February 12, 1934.

† Cf. for the literature and for partial enumerations of applications of the equation: Strutt, M. J.
O., Lamésche, Mathieusche und verwandte Funktionen in Physik und Technik, Ergebnisse der
Mathematik und ihrer Grenzgebiete, vol. 1, No. 3, Berlin, 1932; Whittaker and Watson, A Course in Modern
Analysis, 3d edition, 1920, Cambridge University Press; Humbert, P., Fonctions de Lamé et
Fonctions de Mathieu, Mémorial des Sciences Mathématiques, X, Paris, 1926; Van der Pol, B., and Strutt,
M. J. O., On the stability of the solutions of Mathieu's equation, The Philosophical Magazine, vol. 5
(1928), p. 18.

With either of the parameters $\Lambda$ and $\Omega$ fixed, the relation $\mu = 0$ restricts
the remaining one to a denumerably infinite set of values called the charac- 
teristic values. Broadly speaking the determination of these values and of 
the corresponding Mathieu functions is the matter of prime importance in 
the applications of the equation which belong more immediately to the do- 
main of physics, while the determination of the characteristic exponent in 
terms of a fixed set of parameters is generally the peculiar requirement of the 
applications to astronomy. 
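For parameters of moderate size the stable and unstable cases can be told apart numerically, which may help fix ideas before the asymptotic treatment. The sketch below is not the method of this paper. It assumes the equation has the form $u'' + (\Lambda - \Omega\cos 2z)u = 0$, that the periodic factor in the Floquet form has period $\pi$, and it uses a plain fourth-order Runge-Kutta integration over one period. The trace of the resulting monodromy matrix then equals $2\cosh(\mu\pi)$, so a trace of absolute value below 2 indicates bounded solutions. All function names are invented for this example.

```python
import math


def mathieu_monodromy_trace(lam, omega, steps=4000):
    """Trace of the monodromy matrix of u'' + (lam - omega*cos 2z) u = 0.

    The two fundamental solutions with initial data (1, 0) and (0, 1) are
    integrated over one period pi with classical RK4.
    """
    def rhs(z, u, v):
        # First-order system: u' = v, v' = -(lam - omega*cos 2z) * u
        return v, -(lam - omega * math.cos(2.0 * z)) * u

    def integrate(u, v):
        h = math.pi / steps
        z = 0.0
        for _ in range(steps):
            k1u, k1v = rhs(z, u, v)
            k2u, k2v = rhs(z + h / 2, u + h / 2 * k1u, v + h / 2 * k1v)
            k3u, k3v = rhs(z + h / 2, u + h / 2 * k2u, v + h / 2 * k2v)
            k4u, k4v = rhs(z + h, u + h * k3u, v + h * k3v)
            u += h / 6 * (k1u + 2 * k2u + 2 * k3u + k4u)
            v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
            z += h
        return u, v

    u1, _ = integrate(1.0, 0.0)   # value of the first solution at z = pi
    _, v2 = integrate(0.0, 1.0)   # slope of the second solution at z = pi
    return u1 + v2


def classify(lam, omega):
    """Label the parameter pair by the size of the monodromy trace."""
    trace = mathieu_monodromy_trace(lam, omega)
    return "stable" if abs(trace) < 2.0 else "unstable"


if __name__ == "__main__":
    print(classify(0.25, 0.0), classify(-1.0, 0.0))
```

The borderline values of the trace, $\pm 2$, are where periodic solutions appear, so a numerical test of this kind is unreliable close to the characteristic values.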

When the values of the parameters are small the solution of the differen- 
tial equation is generally and appropriately essayed through the means of 
convergent series expansions. When at least one of the parameters is large, 
on the other hand, the methods of asymptotic representation are adapted 
and have been generally applied. Though the literature covering investiga- 
tions of this latter type is large it can hardly be said that the results recorded 
are by any means complete. Restrictions upon the range of the parameters 
are generally made and frequently only the forms of the Mathieu functions, 
i.e., of the solutions with the period 27, are considered. Again, when forms 
asymptotic with respect to one parameter are obtained their dependence 
upon the remaining secondary parameter may not be considered, the results 
being established, therefore, only for a fixed configuration of the parameters 
relative to each other. Finally the investigations have almost exclusively 
been restricted to the case of a real variable. The most recent report on the 
status of the theory* says on this point: “While we believe that the theory 
of the Hill and Mathieu differential equations with real variables and pa- 
rameters has to a certain extent been rounded out, it is to be emphasized that 
no such assertion can be made concerning these equations with complex 
variables and parameters. ... Only when the problems bearing upon this 
point have been adequately treated may it be hoped to round out the theory 
of the Lamé equation as has been done in the case of the equation of Mathieu. 
Such an investigation would not only throw new light upon many differential 
equations of mathematical physics, but would make possible the application 
of certain of the functions obtained to problems of practical importance.” 

The present investigation is devoted to a general consideration of the 
asymptotic solutions of the Mathieu equation over the complex plane and 
for all real configurations of the parameter values in which at least one is 


* Strutt, loc. cit. (Vorwort). 


1934] THE MATHIEU EQUATION 639 


numerically large. The analytic forms which represent the solutions asymp- 
totically are found to differ in essentially different parameter configurations, 
while in its dependence upon the variable such a representation even for a 
specific solution and with one and the same configuration of parameters 
requires the employment of a variety of analytic forms. In general a special 
form is required for the description in the neighborhood of any point in which 
the coefficient of the equation vanishes, while outside such neighborhoods 
several forms again are made necessary by the incidence of the Stokes’ 
phenomenon. 

The limitation of the discussion to real parameter values was imposed to 
keep the extent of the investigation within its present bounds. The method 
in no way requires such a restriction.* In the matter of the method the present
paper is based upon earlier papers of the author† which gave a general
derivation of the asymptotic solutions of differential equations of the type 


$\dfrac{d^2u}{dz^2} + \{\rho^2\chi_0^2(z) + \rho\chi_1(z) + \chi_2(z,\rho)\}u = 0,$


in which ρ is a large complex parameter and the coefficient χ₀²(z) vanishes at
some point of the domain considered. Aside from the considerations peculiar 
to the Mathieu equation, however, the presence of two independent param- 
eters makes of the present discussion something more than a specialization 
of the general theory cited. With one parameter assigned to a primary role 
it must be shown that the hypotheses of the theory cited are met uniformly 
with respect to the secondary parameter which has remained free. This is 
essential to assure the uniform validity of the conclusions, i.e., that the 
degree of approximation afforded by the asymptotic representation is main- 
tained during a variation of the parameters within the bounds of a given con- 
figuration. 

By way of arrangement there have been grouped in chapter 1 such gen- 
eral considerations as are to be subsequently available. Of the following 
chapters each is given to the deductions peculiar to a specific configuration of 
parameters. Throughout the paper the forms of two fundamental pairs of 
solutions are deduced. This is desirable because of the fact that the members 
of any one pair of solutions may and do become asymptotically indistinguish- 
able in certain regions of the complex plane. Aside from the general asymp- 

* An analogous application of the method to a study of the Bessel functions with both the varia- 
ble and the parameter complex was made by the author in the papers cited below. 

† These Transactions, as follows: On the asymptotic solutions of ordinary differential equations, etc.,
vol. 33 (1931), p. 23; On the asymptotic solutions of differential equations, etc., vol. 34 (1932), p. 447;
The asymptotic solutions of certain linear ordinary differential equations of the second order, vol. 36,
p. 90. These papers will be referred to in the text by the designations L1, L2 and L3.


640 R. E. LANGER [July 


totic forms the special forms which apply to real values of the variable are 
noted, and the forms of the solutions of the associated Mathieu equation, 

(2)    $\dfrac{d^2v}{dz^2} + \{\Omega\cosh 2z - \Lambda\}v = 0,$

are deduced. The asymptotic equations for the characteristic values are given, 
and the characteristic exponent is asymptotically determined. 


CHAPTER 1 
GENERAL CONSIDERATIONS 


1.1. The parameter configurations. The effect of replacing the variable z
by z+π/2 in the equation (1) is merely to alter the sign of the cosine function,
i.e., to replace the parameter Ω by its negative. There is, therefore, no loss
of generality in assuming, as will henceforth be done, that Ω ranges only over
the positive values and zero. The parameter Λ, on the other hand, is to range
unrestrictedly over all real values.

For any positive Ω, however small it may be, the term Ω cos 2z becomes
dominant over Λ when z reaches a domain sufficiently remote from the axis
of reals. In any such domain therefore the character of the differential
equation is essentially altered if Ω is replaced by zero, and it may accordingly
be expected that formulas which are to be valid uniformly for Ω ≥ 0 may be
obtained only for regions of the z plane in which |ℑ(z)| is bounded. This fact
suggests the grouping into separate configurations of those sets of parameter
values in which Ω is relatively small. They are indicated as II and IX in
Figure 1 below, the precise specifications to be later determined.

When Ω>0, the function {Λ − Ω cos 2z} vanishes at an infinite set of
points in the complex plane. As z moves at a suitable distance about any
such point the asymptotic forms which represent a given solution of the differential
equation must be altered, i.e., replaced by others, at certain specifiable
intervals. This so-called Stokes' phenomenon depends quantitatively
upon the order of the zero which is encircled, and since this order changes
from the first to the second when Ω and |Λ| become equal, it may be expected
that results obtained on the assumption that the parameters are sufficiently
different in numerical value may not remain uniformly valid when
these values are allowed to approach equality. This fact serves as the motivation
for considering as distinct configurations those indicated in Figure 1
by the designations IV and VII, in which the parameters numerically approximate
each other. They will be precisely defined at appropriate points
in the discussion which follows. The division of the half-plane of the coordinates
(Λ, Ω) into configurations is, therefore, such as is indicated in the




figure, the hypothesis that at least one parameter be large having the effect 
of excluding from consideration a neighborhood of the point O. 

1.2. The hypotheses of the general theory. The differential equation (1) 
may be transformed in a variety of ways into an equation of the general form 


(3)    $\dfrac{d^2u}{ds^2} + \{\rho^2\chi_0^2(s,\sigma) + \rho\chi_1(s,\sigma)\}u = 0,$


in which ρ, the primary parameter, and σ, the secondary parameter, are
expressible in terms of Λ and Ω. The particular substitutions and hence the
particular equations which result are to depend upon the parameter configur- 
ation which obtains, and will therefore be made at appropriate points as the 


discussion proceeds. 
FIG. 1. [Diagram of the parameter configurations in the (Λ, Ω) half-plane; region labels II, III, IV, V.]


Equations of the type (3) in which, however, the parameter σ is absent
(i.e., fixed) are familiar, the asymptotic forms of their solutions having been
deduced* under hypotheses which for the present purposes may be enumerated
in the following way:

(i) The range of the complex variable s is to be a region R_s in which the functions

$(s - s_0)^{-\nu}\chi_0^2(s)$ and $\chi_1(s)$

are analytic, s₀ being some point of R_s, and ν being some real non-negative constant.
Except in some fixed neighborhood of s₀ the several functions


are to be bounded. 


* Papers L2 and L; cited above. In the formulas of paper L; the variables \, xi, ¢, and & must be 
replaced by ip, ix1, 2¢ and 27 respectively in order that they may appear as given here. 




It is convenient to have at hand the following definitions: 
— ixi(so) ix,(s) 2kx0(s) 
4x0'(So) xo(s) 
xods 


k= 


(5) 


It follows then, as may be shown, from the hypothesis (i) that the functions 
3 1 /¢” viv+4 2 
2\¢e/ 4(+2)?\e 


f nds 


are continuous in the region R, inclusive of the point s=so. A second and 
third hypothesis* made are the following: 


(ii) The differential equation (3) is to be in normal form, i.e., such that 
either x,=0 or else v=2 and 


v= 


{3xd xi — 2xd’x1} me = 0. 


(iii) Either the region R, is to be bounded, or else there are to exist constants 
M and H such that the relations 


< M, <M 


are satisfied for all arcs of integration in R, on which |s—so| >H and on which 
varies monotonically with | . 

When the secondary parameter σ is not fixed but is permitted to vary, the
formulas to be taken from the theory cited will be valid uniformly only if the
hypotheses stated are satisfied uniformly with respect to σ. Specifically the
functions (4) must be uniformly bounded in R_s, the functions (6) must be
uniformly bounded in any fixed finite part of R_s, and the hypothesis (iii)
must be fulfilled with constants M and H which are independent of σ.


* The hypothesis (iv) of papers L2 and Ls is not repeated here. It is obviously satisfied in every 
case of the present discussion. 


kxé 
7? 
(6) = + 1 
® xods 
80 




1.3. The solutions. When the equation (3) satisfies the several hypotheses
and the primary parameter ρ is sufficiently large, the relation defining the
variable ξ determines a map of the region R_s upon a corresponding region
R_ξ in the complex ξ plane. This map is conformal except possibly at the point
corresponding to s₀ where, if ν ≠ 0, the region R_ξ has a branch point whose
order depends upon ν.

The relations 


with h an integral index and ε an arbitrarily small but fixed positive constant,
define in the domain R_ξ the (overlapping) sub-regions Ξ^(h). These correspond
to respective sub-regions of R_s which will likewise be denoted by Ξ^(h).

For any index h the differential equation (3) possesses a fundamental
pair of solutions u_{h,1}(s), u_{h,2}(s), which are characterized by the fact that they
are of peculiarly simple asymptotic forms as compared with the general
solution for values of s which are in the corresponding sub-region Ξ^(h) and
which are not too near the point s₀. When s passes the bounds of the sub-region
Ξ^(h) this simplicity is lost and devolves upon a new set of solutions
which are in turn associated in the manner indicated with the new sub-region
in which s is then to be found. If ν ≠ 0 the forms referred to give valid representations
of the respective solutions only so long as |ξ| > N, where N is a
constant whose magnitude is determined by the degree of approximation
which the asymptotic representation is required to afford. The excepted
region |ξ| < N corresponds in R_s to a neighborhood of the point s₀, and in
this region a distinct representation must in general be employed.

The solutions u_{h,j}(s), j = 1, 2, with a particular index h are thus because
of their simplicity especially adapted for use in any deduction in which the
associated region Ξ^(h) plays a peculiar role. In terms of them, however, any
other solutions may be simply expressed. In particular, it will be noted that
if the point z_α corresponds to s_α under the correspondence of the variables
which relates the equations (1) and (3), then the principal solutions u_α(z),
u_β(z), of the equation (1) relative to z_α, i.e., those determined by the values


(8a)    $u_\alpha(z_\alpha) = 0,\quad \dfrac{du_\alpha(z_\alpha)}{dz} = 1,\quad u_\beta(z_\alpha) = 1,\quad \dfrac{du_\beta(z_\alpha)}{dz} = 0,$


are given by the formulas 
(=) — Un 


ds W 


(8b) 


W 




in which h may be any index, the primes denote differentiation with respect
to s, and W designates the Wronskian


W = — 
which is a constant. 

The principal solutions relative to the origin (z_α = 0) will be designated
throughout the discussion by u_o(z) and u_e(z). Inasmuch as the coefficient of
the differential equation is an even function, they will be respectively odd
and even functions of z as is to be indicated by the subscripts chosen. The
principal solutions relative to the point z_α = π/2 will be denoted by u_α(z)
and u_β(z).

1.4. The asymptotic solutions when ν=1. The special case of most frequent
occurrence in the discussion which follows is that in which ν = 1, i.e.,
in which the zero of the coefficient χ₀²(s) is a simple one. It is convenient,
therefore, to note at this point for general reference the specific formulas
which then apply in the relations of the preceding section, in so far as they
are later to be used. Thus, for h = −1, 0, 1, 2 the solutions u_{h,j}(s) are described
by the following formulas:

When |é| and sis in 


h,l 


h 
(9a) Una(s) = Asie + }, = 1, 2, 


with coefficients to be obtained from the following table: 
(h, | (—1, —1)] (—1,0)| (—1, 1)]] ©, —1)] ,0) | ©, 1) |] —1)] 4,0) | 4,1) 
Ati [1] [1] 
i [-i] 


A}! 0 


Ab [1] 0 


and, when |¢| <N, 


(A) 


with the coefficients 
—1 


(2, —1)} (2,0) | (2,1) | (2,2) 
0 o | (1) | [1] 
[-i]] [-é] | [-é]} 0 
| 0 1 2 : 

ri/3 a ri/3 1 




The symbols J in these formulas designate Bessel functions in the familiar
manner, and the symbol [ ] will be used throughout the discussion in the
sense that [Q] designates a quantity which differs from Q by terms of the
order of ρ⁻¹ and of the order of N⁻¹ uniformly in σ.
From formulas thus given the evaluations 
[1 Je [1 


Une(Sa) = 


Unr(Sa) = 
pilegil2 


when |£,| =N and & isin =”, and W = [2i]p”*, will be immediately noted. 
Direct substitution in the relations (8b) leads, therefore, to the following 


formulas: 
When =>N,zisin and zisin 


1|/dz 


2i \ds 


— + Ass 


(11a) 


and when |¢| <N and z, isin =, 


dz bed (h) (h) 


Tha\'!? 


From these forms certain terms, depending upon the indices, may under cer- 
tain conditions be omitted as asymptotically negligible in comparison with 
others. The precise evaluations will be deferred to the points where applica- 
tions of the formulas are to be made. 

1.5. The "associated" Mathieu equation. The associated Mathieu equation
(2) is obtainable from the equation (1) by substituting in the latter iz
in place of the variable z. Its solutions may, therefore, be derived from those
discussed above by this simple change of variable. In particular it may be
observed that the principal solutions relative to the origin, to be denoted by
v_o(z) and v_e(z), are respectively odd and even functions of z, and that they
are given by the formulas




(12)    $v_o(z) = -i\,u_o(iz),\qquad v_e(z) = u_e(iz).$
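The relations (12) lend themselves to a quick numerical check. The following Python sketch assumes the equation (1) in the form u'' + (Λ − Ω cos 2z)u = 0 and the equation (2) in the form v'' + (Ω cosh 2z − Λ)v = 0; it integrates the first along the imaginary axis and the second along the real axis by a classical Runge-Kutta scheme and compares the results. The function names, the step count, and the sample values Λ = 3, Ω = 1 are illustrative assumptions and not part of the text.

```python
import cmath

def rk4(coef, z0, z1, u0, du0, steps=2000):
    """Integrate u'' + coef(z) * u = 0 along the straight line from z0 to z1.

    Returns (u(z1), u'(z1)). The step may be complex, which is legitimate
    here because the coefficient is an analytic function of z.
    """
    h = (z1 - z0) / steps
    z, u, v = z0, u0, du0
    for _ in range(steps):
        k1u, k1v = v, -coef(z) * u
        k2u, k2v = v + 0.5 * h * k1v, -coef(z + 0.5 * h) * (u + 0.5 * h * k1u)
        k3u, k3v = v + 0.5 * h * k2v, -coef(z + 0.5 * h) * (u + 0.5 * h * k2u)
        k4u, k4v = v + h * k3v, -coef(z + h) * (u + h * k3u)
        u += h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        z += h
    return u, v

LAM, OM = 3.0, 1.0  # illustrative parameter values (assumed, not from the text)

def mathieu(z):
    """Coefficient of equation (1)."""
    return LAM - OM * cmath.cos(2 * z)

def associated(z):
    """Coefficient of the associated equation (2)."""
    return OM * cmath.cosh(2 * z) - LAM

x = 0.8
u_o, _ = rk4(mathieu, 0, 1j * x, 0.0, 1.0)   # odd principal solution at ix
u_e, _ = rk4(mathieu, 0, 1j * x, 1.0, 0.0)   # even principal solution at ix
v_o, _ = rk4(associated, 0, x, 0.0, 1.0)     # odd principal solution of (2) at x
v_e, _ = rk4(associated, 0, x, 1.0, 0.0)     # even principal solution of (2) at x

err_odd = abs(v_o - (-1j) * u_o)   # tests v_o(x) = -i u_o(ix)
err_even = abs(v_e - u_e)          # tests v_e(x) = u_e(ix)
print(err_odd, err_even)
```

Both discrepancies should be at the level of rounding error, since the change of variable carries one integration exactly into the other.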


1.6. The solutions for general values of z. The hypotheses stated in §1.2
under which the forms of the solutions of the equation (1) are obtainable
through the medium of the equation (3) restrict the variable to a region R_z in
which the coefficient (Λ − Ω cos 2z) has at most one zero. It will be found in
the subsequent discussion that this region over which the forms are directly
deducible is in each case either the strip


(13)    $0 \le x \le \pi/2$, where $z = x + iy$,


or some closely related domain. It remains, therefore, to consider the exten- 
sion of the asymptotic representations over the remaining parts of the z plane. 
A method by which this may be done is to be outlined as follows. 

Since the coefficient of the differential equation is an even periodic function
with the period π, the function u(mπ − z) is a solution whenever u(z) is
such and m is an integer. Hence each member of the several relations


(14)
(a)  $u_o(z) = -u_o(\pi - z) + 2u_o(\pi/2)\,u_\beta(\pi - z),$
(b)  $u_e(z) = -u_e(\pi - z) + 2u_e(\pi/2)\,u_\beta(\pi - z),$
(c)  $u_o(z) = u_o(\pi - z) - 2u_o'(\pi/2)\,u_\alpha(\pi - z),$
(d)  $u_e(z) = u_e(\pi - z) - 2u_e'(\pi/2)\,u_\alpha(\pi - z)$


is a solution of the differential equation. The identities are established,
therefore, by the fact that in each relation both members and likewise their
derivatives take the same values at the point z = π/2. A similar comparison
of values at the point z = 2^p π, whatever the integer p, establishes the further
relations


(15)
(a)  $u_o(z) = -u_o(2^{p+1}\pi - z) + 2u_o(2^p\pi)\,u_e(z - 2^p\pi),$
(b)  $u_e(z) = u_e(2^{p+1}\pi - z) + 2u_e'(2^p\pi)\,u_o(z - 2^p\pi).$

Let it be supposed now that the forms of the solutions have been deduced
and so are known for all values of the variable which lie in the strip (13). It
is to be shown then by the method of induction that they are deducible over
the strip S_p, where p is any integer and S_p is defined by the relation


(16)    $S_p:\quad 0 \le x \le 2^p\pi.$

To begin with, let z lie in the region S_0. Then either z or π − z lies in the
strip (13). In the former case the representations of u_o(z) and u_e(z) are known
by hypothesis, whereas in the latter they are given by the identities (14) in
which the forms of the right-hand members are known. Proceeding, let the



representations be considered known in the region S_p with any specific p,
and let z lie in the strip S_{p+1}. Then either z lies in S_p and the forms are already
known, or else both the values (2^{p+1}π − z) and (z − 2^p π) lie in S_p and the forms
of the right-hand members of the relations (15) are known. In the latter event
the identities furnish the representations sought in the part of S_{p+1} not included
in S_p.

Finally the odd and even functional characters of the solutions u_o(z),
u_e(z) may be drawn upon to extend their representations into the left-hand
half-plane, and with the forms of these solutions at hand the representations
of u_α(z) and u_β(z) may be drawn from the identities (14).
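The reflection identities on which this extension rests can be tested numerically. The sketch below takes two of the relations (14) in the forms u_o(z) = −u_o(π − z) + 2u_o(π/2)u_β(π − z) and u_o(z) = u_o(π − z) − 2u_o'(π/2)u_α(π − z), which are the forms fixed by the comparison of values and derivatives at z = π/2, and checks them at a sample real point. The integrator, the names, and the values Λ = 3, Ω = 1, z = 0.3 are assumptions made for illustration only.

```python
import math

def solve(lam, om, z0, z1, u0, du0, steps=4000):
    """RK4 for u'' + (lam - om*cos(2z)) u = 0 on the real axis from z0 to z1.

    Returns the pair (u(z1), u'(z1)); z1 may lie to the left of z0.
    """
    def acc(zz, uu):
        return -(lam - om * math.cos(2 * zz)) * uu

    h = (z1 - z0) / steps
    z, u, v = z0, u0, du0
    for _ in range(steps):
        k1u, k1v = v, acc(z, u)
        k2u, k2v = v + 0.5 * h * k1v, acc(z + 0.5 * h, u + 0.5 * h * k1u)
        k3u, k3v = v + 0.5 * h * k2v, acc(z + 0.5 * h, u + 0.5 * h * k2u)
        k4u, k4v = v + h * k3v, acc(z + h, u + h * k3u)
        u += h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        z += h
    return u, v

LAM, OM = 3.0, 1.0      # illustrative parameters (assumed)
HALF = math.pi / 2
z = 0.3                 # sample point in the strip 0 <= x <= pi/2
w = math.pi - z         # its reflection, lying in pi/2 <= x <= pi

uo_half, duo_half = solve(LAM, OM, 0.0, HALF, 0.0, 1.0)  # u_o(pi/2), u_o'(pi/2)
uo_z, _ = solve(LAM, OM, 0.0, z, 0.0, 1.0)               # u_o(z)
uo_w, _ = solve(LAM, OM, 0.0, w, 0.0, 1.0)               # u_o(pi - z)
ua_w, _ = solve(LAM, OM, HALF, w, 0.0, 1.0)              # u_alpha(pi - z)
ub_w, _ = solve(LAM, OM, HALF, w, 1.0, 0.0)              # u_beta(pi - z)

residual_a = uo_z - (-uo_w + 2 * uo_half * ub_w)   # relation (a)
residual_c = uo_z - (uo_w - 2 * duo_half * ua_w)   # relation (c)
print(residual_a, residual_c)
```

Read in the other direction, the same relations give u_o at the reflected point π − z from quantities known in the strip (13), which is the first step of the induction described above.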

1.7. The characteristic values. With any given value of Ω there are
known to be associated specific characteristic values of Λ for which the differential
equation (1) admits a periodic solution with the period 2π. These
periodic solutions are enumerable, and are each either an odd or an even
function of z.* With a scheme of enumeration which will become clear as the
subsequent quantitative discussion proceeds, the characteristic values for
which the odd solution u_o(z) has the period 2π will be denoted by S_n(Ω),
while those for which the period occurs in the even solution u_e(z) will be
designated by C_n(Ω). The equations of which these values are the roots are
called characteristic equations.

Consider the characteristic equations for the values S_n(Ω). From the
identity (15a) it is seen at once that a necessary and sufficient condition that
2π be a period of u_o(z) is that u_o(π) = 0, an equation which in virtue of the
relation (14c), with z = π, may be written


$u_o'(\pi/2)\,u_\alpha(0) = 0.$


If the root in question is one for which the factor u_o'(π/2) vanishes, it follows
from the identity (14c) that u_o(z) admits no smaller period than 2π. On the
other hand, if the root is one for which u_α(0) is zero, then the solutions u_o(z)
and u_α(z) are linearly dependent. It follows that u_o(z) vanishes at z = π/2,
and hence from the relation (14a) that u_o(z) admits the period π. With the
enumeration to be chosen the characteristic equations for odd periodic solutions
are accordingly the following:


(17)
(a)  $u_o(\pi/2) = 0$, roots $S_{2n}(\Omega)$,
     u_o(z) periodic with the primitive period π;
(b)  $u_o'(\pi/2) = 0$, roots $S_{2n+1}(\Omega)$,
     u_o(z) periodic with the primitive period 2π.


* Cf. Whittaker and Watson, loc. cit., §19.2. 




The characteristic equations for even solutions may be similarly deduced.
Thus from the identity (15b), with p = 0, the condition that 2π be a period of
u_e(z) is seen to be u_e'(π) = 0. From the derived relation (14b), taken at z = π,
the condition is found to be


$u_e(\pi/2)\,u_\beta'(0) = 0.$


If for the root in question u_e(π/2) is zero, the identity (14b) shows that a
smaller period than 2π is precluded. In the alternative the factor u_β'(0) is
zero, u_e(z) and u_β(z) are dependent and hence u_e'(z) vanishes at z = π/2. It
follows from the relation (14d) then that u_e(z) admits the period π. In this
instance, therefore, the characteristic equations are


(18)
(a)  $u_e'(\pi/2) = 0$, roots $C_{2n}(\Omega)$,
     u_e(z) periodic with the primitive period π;
(b)  $u_e(\pi/2) = 0$, roots $C_{2n+1}(\Omega)$,
     u_e(z) periodic with the primitive period 2π.
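As an illustration of how the characteristic equations may be used in practice, the following sketch locates the roots S_1(Ω) and C_1(Ω) by bisection on the conditions u_o'(π/2) = 0 and u_e(π/2) = 0, with the principal solutions obtained by numerical integration. It assumes the equation (1) in the form u'' + (Λ − Ω cos 2z)u = 0. The comparison values are the familiar small-parameter series for the characteristic numbers b_1 and a_1 of the equation written as y'' + (a − 2q cos 2z)y = 0, so that q = Ω/2; these series, the bracketing interval, and all names are brought in from outside the text as assumptions.

```python
import math

def end_values(lam, om, u0, du0, steps=400):
    """RK4 for u'' + (lam - om*cos(2z)) u = 0 from 0 to pi/2; returns (u, u')."""
    def acc(zz, uu):
        return -(lam - om * math.cos(2 * zz)) * uu

    h = (math.pi / 2) / steps
    z, u, v = 0.0, u0, du0
    for _ in range(steps):
        k1u, k1v = v, acc(z, u)
        k2u, k2v = v + 0.5 * h * k1v, acc(z + 0.5 * h, u + 0.5 * h * k1u)
        k3u, k3v = v + 0.5 * h * k2v, acc(z + 0.5 * h, u + 0.5 * h * k2u)
        k4u, k4v = v + h * k3v, acc(z + h, u + h * k3u)
        u += h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        z += h
    return u, v

def bisect(f, lo, hi, tol=1e-10):
    """Plain bisection; requires a sign change of f on [lo, hi]."""
    flo, fhi = f(lo), f(hi)
    if flo * fhi > 0:
        raise ValueError("no sign change on the bracketing interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if flo * fmid <= 0:
            hi = mid
        else:
            lo, flo = mid, fmid
    return 0.5 * (lo + hi)

OM = 0.2          # illustrative small value of the parameter (assumed)
q = OM / 2        # the parameter q of the form y'' + (a - 2q cos 2z) y = 0

# Odd solution with primitive period 2*pi: u_o'(pi/2) = 0.
S1 = bisect(lambda lam: end_values(lam, OM, 0.0, 1.0)[1], 0.5, 1.5)
# Even solution with primitive period 2*pi: u_e(pi/2) = 0.
C1 = bisect(lambda lam: end_values(lam, OM, 1.0, 0.0)[0], 0.5, 1.5)

b1_series = 1 - q - q**2 / 8 + q**3 / 64   # truncated series, terms through q^3
a1_series = 1 + q - q**2 / 8 - q**3 / 64
print(S1, b1_series)
print(C1, a1_series)
```

The bracketing interval (0.5, 1.5) is chosen because both roots reduce to Λ = 1 when Ω = 0; for other indices or larger Ω a different bracket would be needed.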

1.8. The Mathieu functions. When Λ is a characteristic value S_n(Ω) or
C_n(Ω), the corresponding periodic solution u_o(z) or u_e(z) is after suitable
normalization known as a Mathieu function, and is respectively designated by
se_n(z, Ω) or ce_n(z, Ω). Two modes of normalization have been commonly
employed. The first* uses the stipulation that the coefficients of sin nz and
cos nz in the respective Fourier expansions of se_n(z, Ω) and ce_n(z, Ω) be unity,
i.e.,


$\dfrac{1}{\pi}\int_{-\pi}^{\pi} se_n(x,\Omega)\sin nx\,dx = 1,\qquad \dfrac{1}{\pi}\int_{-\pi}^{\pi} ce_n(x,\Omega)\cos nx\,dx = \begin{cases} 2 & \text{for } n = 0,\\ 1 & \text{for } n \ne 0. \end{cases}$


Since the integrands in these relations are even functions, the intervals of
integration may, of course, be reduced to (0, π). It may, however, be further
observed that in virtue of the equations (17), (18), and (14),


(19)
$se_n(z,\Omega) = (-1)^{n+1}\,se_n(\pi - z,\Omega),$
$ce_n(z,\Omega) = (-1)^{n}\,ce_n(\pi - z,\Omega),$

i.e., the Mathieu functions are each either even or odd in the variable
z − π/2. The ranges of integration above may, therefore, be reduced further
to (0, π/2), the formulas which result being



* Cf. Whittaker and Watson, loc. cit. 




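(20)
$se_n(z,\Omega) = \left[\dfrac{\pi\,u_o(z)}{4\int_0^{\pi/2} u_o(x)\sin nx\,dx}\right]_{\Lambda = S_n(\Omega)},$

$ce_n(z,\Omega) = \left[\dfrac{\pi\,u_e(z)}{(4 - 2\delta_{0,n})\int_0^{\pi/2} u_e(x)\cos nx\,dx}\right]_{\Lambda = C_n(\Omega)}.$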


A second mode of normalization* is based on the requirements


$\dfrac{1}{\pi}\int_{-\pi}^{\pi} se_n^2(x,\Omega)\,dx = 1,\qquad \dfrac{1}{\pi}\int_{-\pi}^{\pi} ce_n^2(x,\Omega)\,dx = 1 + \delta_{0,n}.$


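In this case the formulas obtained are


(21)
$se_n(z,\Omega) = \left[\dfrac{\pi^{1/2}\,u_o(z)}{2\left(\int_0^{\pi/2} u_o^2(x)\,dx\right)^{1/2}}\right]_{\Lambda = S_n(\Omega)},$

$ce_n(z,\Omega) = \left[\dfrac{\pi^{1/2}\,u_e(z)}{2^{\,1-\delta_{0,n}/2}\left(\int_0^{\pi/2} u_e^2(x)\,dx\right)^{1/2}}\right]_{\Lambda = C_n(\Omega)}.$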


1.9. Other periodic solutions. The characteristic equations for values of
Λ which yield periodic solutions with periods other than π or 2π may be
deduced by considerations similar to those of §1.7. The identities

(22)
(a)  $u(z) = -u(2p\pi - z) + 2u(p\pi)\,u_e(p\pi - z),$
(b)  $u(z) = u(2p\pi - z) - 2u'(p\pi)\,u_o(p\pi - z),$
(c)  $u(z) = -u((2p-1)\pi - z) + 2u((p-\tfrac12)\pi)\,u_\beta(p\pi - z),$
(d)  $u(z) = u((2p-1)\pi - z) - 2u'((p-\tfrac12)\pi)\,u_\alpha(p\pi - z)$

are easily verified when p is any integer and u(z) is an arbitrary solution of
the differential equation. With the use of them it can be shown, as is outlined


below, that periodic solutions with the periods indicated occur for values of
Λ which are roots of the respective equations


(23)
$u_o(n\pi/2) = 0$, odd solutions with period nπ,
$u_o'(n\pi/2) = 0$, odd solutions with period 2nπ,

(24)
$u_e'(n\pi/2) = 0$, even solutions with period nπ,
$u_e(n\pi/2) = 0$, even solutions with period 2nπ.


* Cf. Strutt, loc. cit. 




Moreover, if n is the smallest integer for which an equation is satisfied, the
period indicated is primitive.

Consider the equations (23). Their sufficiency for the indicated periodicities
may be verified by observing that they imply through the pertinent
identities (22) respectively that u_o(z + nπ) = ±u_o(z). Conversely, if 2nπ is a
period of u_o(z), then u_o(nπ) = 0 by the identity (22a), and this leads when n
is even through the relation (22b) to the one or the other of the equations
(23). If n is odd the result follows from the identities (22c) and (22d), together
with the fact that at least one of the solutions u_α(z) and u_β(z) must differ
from zero at the point z = (n+1)π/2.

The necessity and sufficiency of the equations (24) for solutions of their
associated types is proved similarly, though in some instances the identities
(22) must be differentiated prior to their application.

1.10. The characteristic exponent. When the parameters Λ and Ω are
both fixed the differential equation in general admits no periodic solution.
In this case it is known from Floquet's theory of differential equations with
simply periodic coefficients that there are two solutions of the forms


$e^{\mu z}\phi(z)$, and $e^{-\mu z}\phi(-z)$,


in which φ(z) is a periodic function with the period π, while μ, the so-called
characteristic exponent, is a constant which depends upon Λ and Ω. The
equation for μ is*


$e^{2\mu\pi} - 2\Theta e^{\mu\pi} + 1 = 0,$


whence


(25a)    $\mu = \dfrac{1}{\pi}\cosh^{-1}\Theta = \dfrac{i}{\pi}\cos^{-1}\Theta,$


with Θ = u_e(π). The alternative evaluation

(25b)    $\Theta = 2u_e(\pi/2)\,u_\beta(0) - 1$


may be obtained from the relation (14b).

It is evident that μ is either real or pure imaginary according as Θ > 1 or
Θ < 1. In the former case the solutions noted above become infinite near
the one or the other extremity of the axis of reals and are called unstable;
in the latter case they remain bounded for real values of z and are called
stable.
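The two evaluations of Θ, and the resulting exponent, may be illustrated numerically. The sketch below computes Θ = u_e(π) directly and again as 2u_e(π/2)u_β(0) − 1, then forms μ = (1/π)cosh⁻¹Θ. It assumes the equation (1) in the form u'' + (Λ − Ω cos 2z)u = 0, and the sample values Λ = 3, Ω = 1 and all names are illustrative. Stability is tested here through |Θ| < 1, the usual Floquet criterion, which is an assumption of the sketch.

```python
import cmath
import math

def solve(lam, om, z0, z1, u0, du0, steps=4000):
    """RK4 for u'' + (lam - om*cos(2z)) u = 0 on the real axis from z0 to z1.

    Returns (u(z1), u'(z1)); integration to the left (z1 < z0) is allowed.
    """
    def acc(zz, uu):
        return -(lam - om * math.cos(2 * zz)) * uu

    h = (z1 - z0) / steps
    z, u, v = z0, u0, du0
    for _ in range(steps):
        k1u, k1v = v, acc(z, u)
        k2u, k2v = v + 0.5 * h * k1v, acc(z + 0.5 * h, u + 0.5 * h * k1u)
        k3u, k3v = v + 0.5 * h * k2v, acc(z + 0.5 * h, u + 0.5 * h * k2u)
        k4u, k4v = v + h * k3v, acc(z + h, u + h * k3u)
        u += h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        z += h
    return u, v

LAM, OM = 3.0, 1.0        # illustrative parameters (assumed)
HALF = math.pi / 2

ue_pi, _ = solve(LAM, OM, 0.0, math.pi, 1.0, 0.0)   # u_e(pi)
ue_half, _ = solve(LAM, OM, 0.0, HALF, 1.0, 0.0)    # u_e(pi/2)
ub_zero, _ = solve(LAM, OM, HALF, 0.0, 1.0, 0.0)    # u_beta(0), integrating leftward

theta_direct = ue_pi
theta_alt = 2 * ue_half * ub_zero - 1

mu = cmath.acosh(theta_direct) / math.pi   # complex in general
stable = abs(theta_direct) < 1
print(theta_direct, theta_alt, mu, stable)
```

For these sample values the exponent comes out pure imaginary, so the pair (Λ, Ω) = (3, 1) falls in a stable range.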

1.11. Certain elliptic integrals. It will be found now and again in the dis- 
cussion which follows, that the comparison and identification of certain 


* Cf. Horn, J., Gewöhnliche Differentialgleichungen, Leipzig, 1905, p. 242.




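superficially dissimilar formulas will depend upon the approximate or asymptotic
evaluation of certain elliptic integrals of the type


(26)    $G(\tau, h^2) = \int_0^{\pi/2} \dfrac{1 - \tau\sin^2\varphi}{\{1 - h^2\sin^2\varphi\}^{1/2}}\,d\varphi.$


The value of h² will in every case be either near zero or near 1, and τ will be
either 1 or h².
In terms of the standard complete elliptic integrals


$K = \int_0^{\pi/2} \dfrac{d\varphi}{\{1 - h^2\sin^2\varphi\}^{1/2}},\qquad E = \int_0^{\pi/2} \{1 - h^2\sin^2\varphi\}^{1/2}\,d\varphi,$


it is evident that

$G(\tau, h^2) = K + \dfrac{\tau}{h^2}(E - K).$


Hence on substituting for these integrals their expansions in powers of h, it is
found that when h² is nearly zero


(26a)
$G(1, h^2) = \dfrac{\pi}{4}\Big\{1 + \dfrac{h^2}{8} + h^4 O(1)\Big\},$

$G(h^2, h^2) = \dfrac{\pi}{2}\Big\{1 - \dfrac{h^2}{4} + h^4 O(1)\Big\}.$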
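The reduction of G to the complete integrals K and E, and the leading terms for small h², can be confirmed by direct quadrature. The sketch below takes G(τ, h²) as the integral over (0, π/2) of (1 − τ sin²φ){1 − h² sin²φ}^(−1/2), uses Simpson's rule, and compares against K + (τ/h²)(E − K) and against the two-term expansions (π/4)(1 + h²/8) and (π/2)(1 − h²/4). The function names and the sample moduli are illustrative assumptions.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def G(tau, h2):
    """The integral (26), evaluated numerically."""
    return simpson(
        lambda p: (1 - tau * math.sin(p) ** 2) / math.sqrt(1 - h2 * math.sin(p) ** 2),
        0.0, math.pi / 2)

def K(h2):
    """Complete elliptic integral of the first kind, modulus squared h2."""
    return simpson(lambda p: 1 / math.sqrt(1 - h2 * math.sin(p) ** 2), 0.0, math.pi / 2)

def E(h2):
    """Complete elliptic integral of the second kind, modulus squared h2."""
    return simpson(lambda p: math.sqrt(1 - h2 * math.sin(p) ** 2), 0.0, math.pi / 2)

h2 = 0.3
# Discrepancy between G and K + (tau/h^2)(E - K) for tau = 1 and tau = h^2.
identity_gaps = [abs(G(tau, h2) - (K(h2) + tau / h2 * (E(h2) - K(h2))))
                 for tau in (1.0, h2)]

small = 0.01
# Discrepancies from the two-term expansions valid for h^2 near zero.
gap_G1 = abs(G(1.0, small) - math.pi / 4 * (1 + small / 8))
gap_Gh = abs(G(small, small) - math.pi / 2 * (1 - small / 4))
print(identity_gaps, gap_G1, gap_Gh)
```

With τ = h² the integral G reduces to E itself, which is the case that arises when the phase integral of the Mathieu coefficient is evaluated.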


On the other hand, when /? is nearly 1 the Landen Transformation* 


h sin = sin (2¢ — £) 


(1 
— — cos 
4 h h 


cos t{1 + tan? ¢} 1/2 


yields the form 


—T 2 
G(r, + 
0 


in which 
1+h) 
2 i+h 


The quantity e? tan? / is uniformly small of the order of e. Hence the radical 
may be replaced by its binomial expansion, whereupon the integration leads 
to the formula 


* Cf. Hancock, H., Elliptic Integrals, New York, 1917, p. 84. 




—T 2 27 sint, — 7) sin 
G(r, = — + ( \( 
h i+h h 4h cos* 


4 on +1 (2) 
{ =) 8 


For the special values of 7 this reduces to 


i+h With 8 


G(1, = —+- 
7 h h 
(26b) 


2 h 
= — h + 2h log «). 


CHAPTER 2 
THE CONFIGURATION II 


2.1. The differential equation. When the relative values of the parameters
Λ and Ω are such that the point (Ω, Λ) in Figure 1 lies in the region II at a
sufficient distance from O, i.e., more specifically when Λ is large and positive,
and with a constant M₁ (to be specified below) the relation


(2.1)    $\Omega \le \dfrac{1}{M_1}\Lambda$


is fulfilled, the substitutions

(2.2)    $s = z$,* $\rho = \Lambda^{1/2}$, $\sigma = (\Omega/\Lambda)^{1/2}$

give to the equation (1) the form (3) with


(2.3)    $\chi_0 = \phi,\quad \chi_1 \equiv 0,\quad \phi^2 = 1 - \sigma^2\cos 2s.$


Let the variable z be restricted to any finite region of the complex plane. 
Then a number M, may be determined such that for all admitted values of z 


M, 


1 
(2.4a) S > —-» z=x+iy. 


The constant M₁ of the relation (2.1), which determines the parameter values
to be included in the present configuration, is to be one with which the
condition (2.4a) is fulfilled. The primary parameter ρ is to be thought of as

* The distinction between s and z, which in the present instance is non-existent, is drawn for
the purpose of making the formulas subsequently useful in a case when these variables are not the
same.




bounded below but not above, and the secondary parameter σ is evidently
restricted to the range


(2.5) 


The relation (2.4a), together with 


(2.4b) 
defines a strip of the z plane which is to be designated as R,. The correspond- 
ing domain of the variable s is 


(2.6) R,: OSs'S72/2; |s’’| < 3 s =s' + is”. 


This region includes the origin and it is readily verified that with s₀ = 0 the
hypothesis (i) of §1.2 is fulfilled uniformly in σ with ν = 0. The hypotheses
(ii) and (iii) are likewise fulfilled, since χ₁ ≡ 0 and R_s is bounded. From the
formulas (5) it is seen that in the present instance 7(s)=w;(s)=k=0, in 


consequence of which 
1 S(1 — o) 


These functions are bounded uniformly in σ and hence the requirements
enumerated in §1.2 are completely fulfilled.

2.2. The solutions. Since the case in hand is one in which ν = 0, there exist
solutions of the differential equation which maintain a single asymptotic
form over the entire region R_s. Such solutions with their respective forms are


uo(s) = [1], 
Uo 2(S) = [1 ] 
Their Wronskian has the value W = [2i]ρ. The principal solutions relative to
the point z = 0 are accordingly computed directly from the formula (8b),
with h = 0, s_α = 0, to be


u(z) = 


» 


(2.7) 


ule) = + 


{A — Qcos poi = {A — 


f {A — Qcos 22} 
0 


1 
| 
: (2.8) 
with 
(2.8a) 
| 




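Inasmuch as

$\dfrac{1}{2i}\{e^{i\xi}[1] - e^{-i\xi}[1]\} = \sin[\xi],$


with analogous formulas involving the other trigonometric functions, it is
seen in particular that for real values of the variable


(2.8b)
$u_o(x) = \dfrac{\sin\Big[\int_0^x \{\Lambda - \Omega\cos 2x\}^{1/2}\,dx\Big]}{\{(\Lambda - \Omega)(\Lambda - \Omega\cos 2x)\}^{1/4}},$

$u_e(x) = [1]\Big\{\dfrac{\Lambda - \Omega}{\Lambda - \Omega\cos 2x}\Big\}^{1/4}\cos\Big[\int_0^x \{\Lambda - \Omega\cos 2x\}^{1/2}\,dx\Big].$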
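The quality of these real-variable forms may be gauged by comparison with a direct numerical integration. The sketch below takes the leading-order forms u_o(x) ≈ sin ξ / {(Λ − Ω)(Λ − Ω cos 2x)}^(1/4) and u_e(x) ≈ {(Λ − Ω)/(Λ − Ω cos 2x)}^(1/4) cos ξ, with ξ the integral of {Λ − Ω cos 2t}^(1/2) from 0 to x, drops the bracket corrections, and compares them with Runge-Kutta solutions of u'' + (Λ − Ω cos 2z)u = 0. The sample values Λ = 400, Ω = 40, x = 1.2, the tolerances, and all names are illustrative assumptions.

```python
import math

def simpson(f, a, b, n=400):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def solve(lam, om, x1, u0, du0, steps=4000):
    """RK4 for u'' + (lam - om*cos(2x)) u = 0 from 0 to x1; returns u(x1)."""
    def acc(xx, uu):
        return -(lam - om * math.cos(2 * xx)) * uu

    h = x1 / steps
    x, u, v = 0.0, u0, du0
    for _ in range(steps):
        k1u, k1v = v, acc(x, u)
        k2u, k2v = v + 0.5 * h * k1v, acc(x + 0.5 * h, u + 0.5 * h * k1u)
        k3u, k3v = v + 0.5 * h * k2v, acc(x + 0.5 * h, u + 0.5 * h * k2u)
        k4u, k4v = v + h * k3v, acc(x + h, u + h * k3u)
        u += h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        x += h
    return u

LAM, OM = 400.0, 40.0     # Lambda large, Omega/Lambda small (assumed sample values)

def coef(x):
    return LAM - OM * math.cos(2 * x)

def phase(x):
    """xi(x): the integral of coef(t)**0.5 from 0 to x."""
    return simpson(lambda t: math.sqrt(coef(t)), 0.0, x)

def uo_asym(x):
    return math.sin(phase(x)) / ((LAM - OM) * coef(x)) ** 0.25

def ue_asym(x):
    return ((LAM - OM) / coef(x)) ** 0.25 * math.cos(phase(x))

x = 1.2
uo_num = solve(LAM, OM, x, 0.0, 1.0)
ue_num = solve(LAM, OM, x, 1.0, 0.0)
print(uo_num, uo_asym(x))
print(ue_num, ue_asym(x))
```

The discrepancies are expected to be small compared with the amplitudes of the two solutions, in keeping with the meaning of the symbol [ ], and to shrink as Λ grows with Ω/Λ held fixed.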


The principal solutions relative to z = π/2 are similarly found to be given
by the formulas


Ua(Z) = e~i(t-&) [1] } , 


(2.9) 1 ( de) 1/2 


with 

(2.9a) poe = {A+ {A — Qcos 22} */2dz. 
x/2 

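When z is real they are


(2.9b)
$u_\alpha(x) = -\dfrac{\sin\Big[\int_x^{\pi/2} \{\Lambda - \Omega\cos 2x\}^{1/2}\,dx\Big]}{\{(\Lambda + \Omega)(\Lambda - \Omega\cos 2x)\}^{1/4}},$

$u_\beta(x) = \Big\{\dfrac{\Lambda + \Omega}{\Lambda - \Omega\cos 2x}\Big\}^{1/4}\cos\Big[\int_x^{\pi/2} \{\Lambda - \Omega\cos 2x\}^{1/2}\,dx\Big].$
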


In the special case that σ = 0 (i.e., Ω = 0) the differential equation (1)
is directly integrable, and it is verified immediately that the formulas above
are correct when the symbols [ ] are omitted. It may be concluded, therefore,
in the discussion of this chapter that the quantities [1] reduce to 1
when σ² = 0.

2.3. The solutions of the associated Mathieu equation. The principal solutions
of the associated Mathieu equation (2) relative to the origin may be
derived from the functions (2.8) by the substitutions (12) as was noted in
§1.5. Their forms so obtained are


(2.10)
$v_o(z) = \dfrac{\sinh\Big[\int_0^z \{\Lambda - \Omega\cosh 2z\}^{1/2}\,dz\Big]}{\{(\Lambda - \Omega)(\Lambda - \Omega\cosh 2z)\}^{1/4}},$

$v_e(z) = [1]\Big\{\dfrac{\Lambda - \Omega}{\Lambda - \Omega\cosh 2z}\Big\}^{1/4}\cosh\Big[\int_0^z \{\Lambda - \Omega\cosh 2z\}^{1/2}\,dz\Big],$



the region for z being 
1 M, 
cosh~! 


—r/2sys0. 


The solutions (2.10) are evidently asymptotically multiples of each other 
when 2 is real and large. A pair, v,(z), vs(z), not subject to this disadvantage is 
that obtainable by the substitution of iz for s from the functions (2.7). Their 


forms are explicitly 


v,(2) = |- — Qcosh 


{A — Q cosh 22} 1/4 


(2.11) 


1 z 
= fa — 2a} f {A — Qcosh 25} ds. 


2.4. The characteristic values. If S_p(Ω) and C_q(Ω) are a pair of characteristic
values, the substitution of the forms (2.8b) into the characteristic
equations (17) and (18) shows that each of these values is a root of an equation


(2.12)    $\int_0^{\pi/2} \{\Lambda - \Omega\cos 2x\}^{1/2}\,dx = \Big[\dfrac{n\pi}{2}\Big]$


with the integer n suitably adjusted to p or q as the case may be. To determine
this adjustment, it need merely be observed that when Ω = 0 the equation
reduces to Λ = n², and the corresponding Mathieu functions to sin nz and
cos nz. Since these are by definition the forms of se_n(z, 0) and ce_n(z, 0), it
must be concluded that p = n and q = n, i.e., the form (2.12) is that of the
characteristic equation both for S_n(Ω) and for C_n(Ω).

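The symbol [ ] in the equation (2.12) represents a quantity of the
order of Λ^(−1/2) uniformly in σ, which vanishes when σ = 0. Since it like the
equation (1) depends analytically upon σ², the equation (2.12) may be
written


$\int_0^{\pi/2} \{\Lambda - \Omega\cos 2x\}^{1/2}\,dx + \sigma^2 O(\Lambda^{-1/2}) = \dfrac{n\pi}{2}.$


The substitution x = π/2 − φ reduces this to


$(\Lambda + \Omega)^{1/2}\,G(h^2, h^2) + \sigma^2 O(\Lambda^{-1/2}) = \dfrac{n\pi}{2},$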



where G is the elliptic integral of (26) with h² = 2σ²/(1+σ²). Since this value
of h² is small, the evaluation (26a) gives to the equation the form
AV2{1 + ¢4O(1)+ = n, 
from which it follows that 
Q 
S,(2) = n* + —O(1), 
n2 
(2.13) ‘ 
C,(2) = n? + — O(1), 
n2 


the quantities indicated by the symbols O(1) being uniformly bounded as to
n and Ω while the configuration with which the present chapter deals is
maintained.
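A minimal numerical illustration of the phase-integral form of the characteristic equation follows. The sketch neglects the quantity denoted by [ ] and solves the equation stating that the integral of {Λ − Ω cos 2x}^(1/2) over (0, π/2) equals nπ/2, by bisection in Λ, for n = 5 and Ω = 2. For comparison it uses the familiar small-parameter series a_n ≈ n² + q²/(2(n² − 1)) for the equation written as y'' + (a − 2q cos 2z)y = 0, with q = Ω/2. That series, the bracketing interval, and all names are assumptions brought in from outside the text.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def phase_integral(lam, om):
    """Integral of (lam - om*cos 2x)**0.5 over (0, pi/2); requires lam > om."""
    return simpson(lambda x: math.sqrt(lam - om * math.cos(2 * x)), 0.0, math.pi / 2)

def leading_order_root(n, om, lo, hi, tol=1e-10):
    """Bisection for the Lambda at which the phase integral equals n*pi/2."""
    target = n * math.pi / 2
    f_lo = phase_integral(lo, om) - target
    f_hi = phase_integral(hi, om) - target
    if f_lo * f_hi > 0:
        raise ValueError("no sign change on the bracketing interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = phase_integral(mid, om) - target
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

n, OM = 5, 2.0                 # illustrative index and parameter (assumed)
q = OM / 2
lam_wkb = leading_order_root(n, OM, 24.0, 26.0)
lam_series = n**2 + q**2 / (2 * (n**2 - 1))   # comparison value from the small-q series
print(lam_wkb, lam_series, lam_wkb - n**2)
```

The root lies slightly above n², and the excess is small for this choice of n and Ω, which is the behaviour that the formulas (2.13) describe.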

2.5. The characteristic exponent. The substitution into the formula (25b) 
of the values given by (2.8b) and (2.9b) yields the evaluation 


© = [2] cos [é] cos [t] — 1 
cos 2f + o?O(A-!/2), 


Accordingly, from (25a) an asymptotic formula for the characteristic ex- 


ponent is 
Q?2 
ou} 


1 
(2.14) — cos“ {cos( f 2{A — + 
T 0 


When Ω = 0 this reduces to μ = iΛ^(1/2), a result which may be verified by actual
integration of the differential equation.

Inasmuch as the quantity within the brace in the formula (2.14) does not
exceed unity, except possibly for very small ranges of the parameters near
those values for which the integral is a multiple of π, it follows that the
configuration under consideration in this chapter is predominantly one of
stable solutions.*


CHAPTER 3 
THE CONFIGURATION III 


3.1. Definitions. The parameter configuration contiguous with that of 
the preceding chapter and designated by III in Figure 1 is to be defined by 
the relation 


(3.1) A<Q<A-—M,A, 


* Cf. the Figure 3 in Strutt, loc. cit. 




in which M, is the constant in (2.1), and M; is to be momentarily discussed. 
The substitutions 
A-2 Q — is 


s= 
Aue A 


(3.2) p= 


reduce the differential equation (1) in this case to the form (3) with

\[
\chi_0 = \varphi, \qquad \chi_1 = 0, \qquad \varphi^2 = \frac{2(1-\sigma^2)\sinh^2 \sigma s}{\sigma^2} - 1. \tag{3.3}
\]

The parameter $\rho$ is evidently restricted by the relation $\rho \ge M_3$, and since the degree of approximation which the asymptotic formulas yield depends upon the magnitude of $\rho$, the constant $M_3$ is in any specific case to be chosen such that representations which are uniformly suitable to the purposes intended are obtained. The secondary parameter is clearly confined to the fixed closed interval


1 
(3.4) 


in which the lower boundary could in fact more strictly be replaced by $M_3A^{-1/2}$.

Let z be restricted for the discussion of this configuration to the infinite 
half-strip R, given by the formulas 


(3.5) R:: 


The extension of the solutions from this domain to the entire strip (13) may 
be accomplished by the use of the identities 


Ua(2) = Ua(— 2) — 2), 
us(z) = — 2) + 2ug(0)u.(— 2), 


and the odd and even characters of u,(z) and u,(z). Their extension to general 
values of z thereupon follows on the lines of §1.6. 

3.2. The variables $s$, $\Phi$ and $\xi$. The region $R_s$ corresponding to $R_z$ is the infinite half-strip


-—ss"s—- 
(3.6) 2e 


Within this region $\chi_0^2(s)$ has a single zero located on the axis of reals at the point

\[
s_0' = \frac{1}{\sigma}\sinh^{-1}\frac{\sigma}{\{2(1-\sigma^2)\}^{1/2}}. \tag{3.7}
\]



Though $s_0'$ depends upon $\sigma$ it is both bounded and bounded from zero for all admitted values of the parameters.

The relation between $s$ and the quantity $\Phi$ maps $R_s$ upon a corresponding region $R_\Phi$ conformally except at the point $s_0'$. The shape of $R_\Phi$ may be easily determined by observing the values of $\Phi$ when $s$ is either real or on the boundaries of $R_s$. With $R_s$ thought of as cut along the axis of reals from the origin to $s_0'$ these values for the upper half of $R_s$ are


fors’ =O+and0Ss’S 5), 
sinh? gs’) 1/2 
Pu 
Oand0 S s” S x/(20), 
sin? are 
+ en f if 20 — i} ds"; 


0 


Oand sj Ss’, 


sinh? os’ 
f {20 — — i} ds’; 
80’ 


0 
for s’’ = r/(2c) andO S s’, 


ni cosh? os’ 
>= »(=)+ f if ds’. 
20 0 o? 


The map of the lower half of $R_s$ is obtainable by reflection from that of the upper half, since conjugate complex values of $s$ lead to conjugate values of $\Phi$.


Finally since 
sinh os 


2 
> —| | 
it follows that when |s| is sufficiently large 


sinh os 
{2(1 — } 1? 
(3.8) 
2 — 
~ —{2(1 — o”)}1/? sinh?§—; 
2 


the symbolism designating that the ratio of the members of either relation becomes 1 as $|s| \to \infty$. From the second relation it follows that when $c$ is any sufficiently large constant the line $s' = c$ maps upon a simple curve in $R_\Phi$. The uniqueness of the correspondence between points of $R_s$ and $R_\Phi$ is thereby assured.* Figure 2 indicates the map.


* Cf. Osgood, W. F., Lehrbuch der Funktionentheorie, vol. 1, Leipzig, 1912, p. 377.




The variables $\Phi$ and $\xi$ differ only by the real factor $\rho$, whence the domains $R_\xi$ and $R_\Phi$ differ only in scale. Figure 3 indicates the relation between $R_z$ and $R_\xi$, each domain being divided into the sub-regions $\Xi^{(l)}$ defined in (7). The lines by which this sub-division is effected need not be determined with precision, for due to the overlapping of the regions any displacement of the curves which does not affect the character of the figure is immaterial.

[Fig. 2]

[Fig. 3]

3.3. Fulfillment of the hypotheses. The zero of xo7(s) at s¢ is of the first 
order. Hence in the hypotheses of §1.2 the values y=1, 7=wi:=k=0 are to 




be used. With the value of ¢ given by the formula (3.3) it is found that the 


functions (6) are in the present instance 
6 5(2—a7) 


1 5 2 


Let the region $R_s$ be divided into three parts by the relations
(a) $|s - s_0'| \le \delta$,
(b) $\delta \le |s - s_0'| \le H$,
(c) $H \le |s - s_0'|$,
with the constants $\delta$ and $H$ as specified in the following. It is to be shown that in each of these parts the hypotheses of §1.2 are uniformly fulfilled.

To begin with let H be chosen so large that in the part (c) the formulas 
(3.8) may be applied. Then it is a matter of simple computation to show that 


in this part of R, the hypotheses (i) and (iii) are uniformly fulfilled. 
Next let 5 be chosen so small that within the part (a), |¢?| <4 for all 


admitted values of o. Then 


(1) 


2 
with O(1) designating functions which are uniformly bounded. Since 


o @ 


whereas from the formula (3.3) 


¢? { ( 
ame 1 2 1 


it is found that 


¢* 39? 
3(2 — §(2 — 02) wou}. 


With this evaluation it is seen directly that in the part (a) the functions (3.9) 


are uniformly bounded. 
Lastly in the part (b) the formula (3.3) may be written 


sinh — s¢ ) 4 sinh? o(s — 


2 


xe(s) = 242 — 


20 




It is evident from this that both 


X(s) and f 


are non-vanishing and continuous as functions of the two variables $(s - s_0', \sigma)$ in the closed region determined by (b) and (3.4). Accordingly, they are bounded uniformly in $\sigma$ and the hypothesis (i) is uniformly fulfilled. Clearly also the functions (3.9) are uniformly bounded and so the requirements of §1.2 upon the differential equation are uniformly met.

3.4. The forms of the solutions. Since $\varphi^2(s)$ has a simple zero in $R_s$ the asymptotic representation of any solution of the differential equation is subject to the Stokes' phenomenon, and $\nu$ being 1 the formulas of §1.4 are applicable. From Figure 3 it is seen that the origin $z = 0$ may be regarded as lying in the sub-region $\Xi^{(-1)}$. Hence with $l = -1$ and the subscript $\alpha$ replaced by 1 the formulas (11a) and (11b) yield the representations of the solutions $u_o(z)$ and $u_e(z)$. It may be observed from Figure 3, however, that the value $\xi_1$ which corresponds to $z = 0$ (at $C_1$ in the figure) is such that $i\xi_1$ is real and negative, so that any quantity multiplied by $e^{i\xi_1}$ is asymptotically negligible in comparison with the same multiplied by $e^{-i\xi_1}$. With the omission of such negligible terms the formulas obtained are the following:

When z isin and |£| 


1 
Uo(z) = ) 4 
2 2 0,1 0,2 
(3.10) p oid 


with coefficients 



4 l —1 0 1 

|— | deft] | [1] | 
| | eft] | | 
[1] ie~*#s[1] ie~*#:[1] 


When 


wit 13 13 
£1 1 
(3.10b) 


1/2 


In the original variables 


po 


= {Qcos 2z — A}#/?, = 


= A — 


Yo 
if {2 cos 22 — if {A — Q cosh 2y} 
yo 0 


with yo=4 cosh! A/Q. Further, it may be noted that since the values of @ 
on the lines AC and AC, in Figure 3 differ only in sign, therefore 


in 
= ods == ff $dz, 
o A o A 


— &), in BOY, 


0 (E+ &), in 


are also valid provided the entire path of integration is taken in each case in 
the sub-region indicated. 

The formulas (11a), (11b) may likewise be drawn upon to give the repre- 
sentations of the solutions u,.(z), us(z). If the point corresponding to z = 7/2 is 
Sa, the subscript a is to be replaced by 2, and since £ (at D, in Figure 3) lies 
in the region =, his again to be taken as —1. With the omission of asymp- 
totically negligible terms the formulas obtained are the following: 

When zis in =“, and || =N, 


whence the formulas 


1 o2 
Ua(z) = (=) + 


1 
us(z) = (2) { + Kot , 


with coefficients 


(3.11) 




1 0 


When |é| 


Tio? 1/2 
—ik, 3 
ta(2) -( = —) + 
(3.11b) 


1/2 


ua(z) = ( 


Again 


& = & — f {A — Q cos 2x} 
0 


Figure 3 shows that the segments —7/2<x<0 and 0Sx<7/2 of the 
axis of reals lie respectively in the sub-regions = and =‘-», The formulas 
above appropriate to these regions accordingly yield the descriptions of the 
solutions when z is real. It is found that these formulas are precisely those 
given in (2.8b) and (2.9b), though it should be noted that with the difference 
in the definition of the parameter p the significance of symbol [ _] is slightly 
different in this chapter from that in the preceding one. 

The pairs of solutions (3.10) and (3.11) have each the defect that in the 
region about the upper part of the axis of imaginaries the component solu- 
tions are asymptotically multiples of each other. The pair of solutions 
u_1,1, U-1,2 given in (9) would be one not subject to this particular shortcom- 
ing. 

3.5. The solutions of the associated Mathieu equation. If z lies in any of 
the domains indicated in Figure 4, the point iz lies in the corresponding 



sub-region of R, as shown in Figure 3. In accordance with (12) the represen- 
tations of iv,(z) and 2,(z) are therefore obtainable in any one of the regions 


[Fig. 4]


indicated by the mere substitution in the associated formulas (3.10) of 6 and 
£ in place of ¢ and &, the former being the same functions of iz as the latter 
are of z. Explicitly 


{@ cosh — A}1/2, 
o 


t= f {Q cosh 22 — A}*2dz, xo = 4 cosh! A/Q. 
zo 


In particular, for real values of the variable the formulas so obtained are 
the following: 
For 


Jo( x) = inh {a-2 sh 2 | 
(3. 12a) 


la-o cosh 2x 


For xSao, |E| <N, 


(2e)-1/2| | 


3.12b) v(x) = 
( b) v(x) {(A—2)(A—2 cosh 2x) }1/4 


| )]. 


For xo <x, 




For x<xo, |E| =N, 


[1 Jet 


(3.12d) = cosh 22—A)} cos | cosh | 
For the $x$ ranges concerned in the cases (b), (c) and (d) the representation of $v_e(x)$ has been omitted since it is found to differ in appearance from that of $v_o(x)$ only in that the factor $(A - \Omega)^{-1/4}$ is replaced by $(A - \Omega)^{1/4}$. For the

range in case (b) the value of £ is imaginary, i.e., =e-**#/? || , and the rela- 
tion 
1/27 


K13(| |) 


was used. 

As already noted in §3.4, a pair of solutions which unlike those above are 

not asymptotically multiples of each other for large real values of z would be 

that obtainable in the manner used above from the functions u_,,;(z) described 
in (9). 

3.6. The characteristic values and exponent. The forms of both the exponent $\mu$ and the characteristic equations were found in chapter 2 to be determined by the formulas (2.9b). Since these formulas, except for the interpretation of the symbol [ ], remain valid for the configuration at present under discussion, the deductions of §2.5 and §2.4 require but slight modification to apply to the case in hand. The characteristic exponent is thus given by the formula


\[
\mu = \frac{i}{\pi}\cos^{-1}\cos\left[\int_0^{\pi/2} 2\{A - \Omega\cos 2x\}^{1/2}\,dx + O\!\left(\frac{A^{1/2}}{A-\Omega}\right)\right]. \tag{3.13}
\]


The order of the final term within the bracket evidently increases with $\Omega$, from which it is evident that the domain of parameter values for which $\mu$ is real, i.e., for which there are unstable solutions, increases in extent as the upper end of the range of values $\Omega$ admitted in the configuration of the present chapter is approached.

The characteristic values $S_n(\Omega)$ and $C_n(\Omega)$ are each the root of an equation of the form (2.12) which in the present instance is more explicitly

\[
\int_0^{\pi/2} \{A - \Omega\cos 2x\}^{1/2}\,dx + O\!\left(\frac{A^{1/2}}{A-\Omega}\right) = \frac{n\pi}{2}. \tag{3.14}
\]


The lower end of the $\Omega$ range joins with that of the configuration II, and for such parameter values the formulas (2.13) are again valid as was to be expected. To obtain formulas valid near the upper end of the range the following process may be used.

Let $k_n$ be defined by the relation

\[
A - \Omega = 2^{5/2} k_n \Omega^{1/2}, \tag{3.15}
\]

and in the integral of (3.14) replace $x$ by $\pi/2 - \varphi$. Then the equation becomes


ur 
A 2)!°G(h?, h? =—y, 


with $G$ the elliptic integral of (26) and

\[
h^2 = \left(1 + \frac{2^{3/2}k_n}{\Omega^{1/2}}\right)^{-1}.
\]

For the larger of the admitted values of $\Omega$ the ratio $k_n/\Omega^{1/2}$ is of the order of $A^{-1/2}$ and $h^2$ is therefore nearly 1. With the use of the formula (26b) the equation may accordingly be written


(29)1/2 — log 


ky 
(322)1/2 


(3. 14b) 


Recalling (3.15), therefore, it follows that 
Sn(Q) = 2+ 
(3.16) (Q) + 
C,(Q) = + 


with each &,(m) a root of an equation of the form (3.14b). 
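The reduction of the phase integral to an elliptic integral by the substitution $x = \pi/2 - \varphi$ can be checked numerically. The sketch below assumes only the standard complete elliptic integral of the second kind, $E(h^2) = \int_0^{\pi/2}(1 - h^2\sin^2\varphi)^{1/2}d\varphi$, with $h^2 = 2\Omega/(A+\Omega)$. Whether this coincides with the $G$ of (26) in normalization is not confirmed here, since (26) lies outside this section; the names `ellipe` and `phase_integral` are hypothetical.

```python
import math

def ellipe(m, tol=1e-14):
    """Complete elliptic integral of the second kind E(m), 0 <= m < 1, by the AGM iteration."""
    a, b = 1.0, math.sqrt(1.0 - m)
    c_sq_sum = 0.5 * m                   # (1/2) * 2^0 * c_0^2 with c_0^2 = m
    power = 1.0
    while abs(a - b) > tol * a:
        c = 0.5 * (a - b)                # c_{n+1} = (a_n - b_n) / 2
        a, b = 0.5 * (a + b), math.sqrt(a * b)
        power *= 2.0
        c_sq_sum += 0.5 * power * c * c  # accumulates (1/2) * sum of 2^n * c_n^2
    K = math.pi / (2.0 * a)              # complete integral of the first kind
    return K * (1.0 - c_sq_sum)

def phase_integral(A, Omega, panels=2000):
    """Composite Simpson estimate of the integral of sqrt(A - Omega*cos 2x) over [0, pi/2]."""
    h = (math.pi / 2) / panels           # panels must be even
    total = 0.0
    for k in range(panels + 1):
        weight = 1 if k in (0, panels) else (4 if k % 2 else 2)
        total += weight * math.sqrt(A - Omega * math.cos(2 * k * h))
    return total * h / 3

A, Omega = 30.0, 20.0                    # sample values with A > Omega
lhs = phase_integral(A, Omega)
rhs = math.sqrt(A + Omega) * ellipe(2 * Omega / (A + Omega))
```

The two numbers `lhs` and `rhs` agree to quadrature accuracy, which is the identity $A - \Omega\cos 2x = (A+\Omega)(1 - h^2\sin^2\varphi)$ integrated over the quarter period.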


CHAPTER 4 


THE CONFIGURATION IV 


4.1. The differential equation. Let the configuration designated as IV in Figure 1 be defined as that comprising the parameter values $(\Omega, A)$ in which both are large and


M, being the constant in the relation (3.1). Then the substitutions 


\[
\rho = (32\Omega)^{1/2}, \qquad \sigma = \frac{A - \Omega}{(32\Omega)^{1/2}} \tag{4.2}
\]

determine $\rho$ as a large parameter, while the range of values given to $\sigma$ is bounded. The differential equation (1) takes the form (3) with the coefficients


=sins, 
(4.3) Xo 3 


x 
in virtue of which the functions (5) are in this case explicitly 
k= — i, 


n(s) = 2io tan rt 


= sin cos — +— sec —?, 
1 cos 2 sec 2 
s 
= — sin?— — — log cos? —- 
2 p 2 
Let $R_z$ be chosen as the strip (13). Then in the region $R_s$ the coefficient $\chi_0^2$ has a single zero located at the origin and of the second order. It must be shown that with the appropriate values $s_0 = 0$, $\nu = 2$ the requirements of §1.2 are uniformly fulfilled. The hypotheses (i) and (ii) offer no difficulty in this respect, while the consideration of the functions (6) and the hypothesis (iii) may be made as follows.
The relation 


defines g, in terms of which 


er~—1 
2q p 


2¢ 
2et — 2 — get + — (1 — “ 
p 


= — ot + 


while the various members of the formula 


3 2¢’ 
¢ /\o 


are found to be 


AY 
q e* = cos? — 
2 




o 
1 e~*a 
p 


p 


20 
2¢’ p 
— = cot— — tan —{———- 
2 2 
1+—oe 
p 
—-1+ 
—1) 


It is to be observed now that q vanishes with s, that | e*| >4 in R,, and that 
the ratio a/p will be uniformly as small as desired if 2 is restricted to remain 
sufficiently large. It is consequently seen that the brace in the formula for 
is uniformly bounded from zero and hence that both w(@) and w; are uni- 
formly bounded in any finite part of R,. Finally, when | s| is great the asymp- 
totic formulas 

—1 


ef, 
2 16 


~ 207g, ds ~ + idq 


are readily checked and in virtue of them the uniform fulfillment of the 
hypothesis (iii) becomes evident. 

4.2. The solutions $u_o(z)$ and $u_e(z)$. The variables $\Phi$ and $\xi$ differ only by the real factor $\rho$, while $s$ and $z$ are identical. Since the values of $\Phi$ on the boundaries of $R_s$ are as follows:
for $s' = 0$,


” ” 
= — —sinh?— — — log cosh? —; 
2 2 p 2 


for s’ =7/2, 
1 cosh (sinh s”  o 
= {— — — log + if + — tan“! (sinh 
4 p 2 4 p 


the map of $R_z$ upon $R_\xi$ is as indicated in Figure 5. The figure shows also the partition of these regions into the sub-regions $\Xi^{(l)}$ defined in (7).




The representation of a pair of solutions (s), w2(s) which are determined 
by the initial values 


ip\1/2 1/4 
u(0) = 0, ui =(*) (1 +=) 
4 p 
2a —1/4 
= (1 +=) =0 


is known,* and is expressible in terms of the confluent hypergeometric func- 


[Fig. 5]


tions customarily designated by M;,:.} With the functions 2; defined by the 
formulas 


it is found thus that the principal solutions, which are evidently mere multi- 
ples of u; and mu, are the following: 
For 


2\12 
u(z) = (=) o)], 
(4. 6a) 


uae) = (+) 


* Paper Ls. See, however, the footnote on p. 646 regarding the differences of notation. 
† Cf. Whittaker and Watson, loc. cit., chapter XVI.




On the other hand, when z is not in the neighborhood of the origin the formu- 
las are the following*: 
For |£| =N, and z in 


| (27) 

ia) 

| (2 


3 — ig) 


1/2 


1e29F 


for use in these formulas it is permissible to write in terms of the original 
variables 


@= [t]sinz, = (22)[1](1 — cos2), 


y io 1 1/4 
eit e (ie /4)(1—cos 2) = (--=;) [1]. 
1+ cosz 1 + cosz 


When z is real the same is true of ¢, V and , and the last of these is posi- 
tive. For such values the functions 2; of (4.5) are real, and the formulas 
(4.6a) are therefore directly real. From Figure 5 it is seen that such values 
of z lie in =, whence the appropriate formulas (4.6b) reduce to 


(4.7) 


* The symbol [], is used in the ser.se that [Q], denotes a quantity which differs from Q by terms 
of the order of (log p)/p and terms of the order of N-. 


u,(z) = (=) (ip)-3/4 
+ 
(4. 6b) 
(2) ( : 
= 
2¢ : 
4 
with coefficients 
l —1 0 1 
0,2 
1 
e,l 
Ro § 
3 
a 
| 




(x) = (= — | i log 2§ — +=] 
Uo(x 


1 


x p r 10g v2 8 


2 
The symbols I’; and y; designate the real values determined by the formulas 
(4.8) + ic) =Tyetin, (2 + io) = 


in which the left-hand members are gamma functions. 

4.3. The solutions u,(z) and u(z). The solutions of the equation (3) espe- 
cially associated with the sub-region = which by Figure 5 contains the 
point z= 7/2, are those described by the following formulas: 

For |é| and s in 


(49a) mo,s(s) = BO + BO 


with coefficients 
—1 


[1]; 


— | 
— io) — io) 


4 


| | 
+ io) + io) 


[1], [1], 


‘4 


1/2 2 


wi/4 


-i 
1/2 
[an 
) + i(, o) 


1934] 671 
(4. 6d) 
0 | 1 
Be = [1h (1) 
(1) 
2,1 
| [1]: 
| For |é| 
(4. 9c) 
3 




The substitution into the formulas (8b) is simple, the Wronskian having 
the value W =(ip)"?[1],, and if the subscript 2 is used to designate evalua- 
tions at z=7/2, it is thus found that we have the following: 

For and z in 2, 


=" 
Ua(Z) € [ 


1/2 é ig 


2 
where 


For | ¢| <N, 
: 2 cos Ez sin © 
sin 


(4.10b) 
ug(z) = 


with 
p 


1 


(4.11) 


log p — -=]. 


For real values of z the formulas (4.10b) are directly real, while the ap- 
propriate formulas from (4.10a) reduce to 


{2 x 
Ph sin |= cos x — log tan | 
4 2 


(4.10c) 


[1 p x 
us(x) = cos | — cos x — 2a log tan—]. 
4 


4.4. The solutions of the associated Mathieu equation. The representa- 
tion of the solutions iv,(z) and 2,(z), of the “associated” differential equation 




(2), are obtainable, as is now familiar, from the formulas of §4.2 by the sub- 
stitution in place of o,& and W of the respective functions of iz, which may be 
designated by ¢, — and V. Explicitly the evaluations 


= | —| sinh z, 


€ = (20)1/2[1](1 — cosh 2), 
( 2 
14) (cosh) | 
1 + cosh z 


ail 4 1/4 
1 + cosh z 


may be used. The sub-regions of the z plane in which the respective formulas 
so derived are valid are as is shown in Figure 6. 


[Fig. 6]


In particular, when z is real and positive the forms deduced from (4.6b) 
for 2“ reduce to the following: 
for 2, 


v(x) =p —} ¢ sin} 
r 8 


141 1 


v(x) = [=| cos | | log 2| =|, 
8 


On the other hand, when x is small, i.e., 


for |—| <N, 


(4.12a) 


v(x) = (=) |€|,0)], 


(4.12b) 
- (=) v[te(— |= |,0)]. 




The functions within the brackets may be shown to be explicitly real as they 
should be. 

4.5. The characteristic values. The values (4.6d) substituted into the 
characteristic equations (17) and (18) give to the latter the forms 


n 
(20)1/2 4 log (320) — + O(2-1/? log 2) = = 


for an odd Mathieu function, 
(4.13) 


n 
(29)? + = log (322) — v2 — + log 2) = = 


for an even Mathieu function. 


These equations may be given a somewhat more detailed form when $\sigma$ is near either the one or the other extreme or the middle of its admitted range of values. The indices of the characteristic values which satisfy the equations with a specific integer $n$ on the right may also be determined as will be shown.

The theory of the gamma function supplies, in particular when $c_1 = 3/4$ and $c_2 = 1/4$, the formulas*


+ — tan“ ), 
(4.14) log '(c; + io) = 3 log 2x + (c; — 3} + io) log (c; + io) 


— (c; + io) +0(7-), 
— = wesc xt, 


and from the first of these it is readily seen that with $\Omega$ fixed the left members of the equations (4.13) vary monotonically with $\sigma$ so that the roots for any integer $n$ are unique.

When $\sigma$ is near the upper end of its admitted range of values, it is large and positive, and the second of the formulas (4.14) gives the evaluations


vj; = logo — + (2c; — 1) (x/4) + O(1/o). 


Both the equations (4.13) thus become 


(4.13a) (20)"2 — © log + + log 2) +0( ) 
2 322 A-Q 2 
* Cf. Nielsen, N., Handbuch der Theorie der Gammafunktion, Leipzig, 1906, p. 23 and pp. 94 
and 209. 




which is, therefore, the form of the characteristic equations when $\Omega$ is near the lower end of the range of values admitted for it in the present configuration. Since for these values the configurations of the present and the preceding chapter abut, the indices of the characteristic values concerned may be determined by a comparison of the equations (4.13a) and (3.16), $k_n$ in the latter having been defined precisely as $\sigma$ is in the former. With a given value of $n$ the roots of the equations (4.13) are thus seen to be precisely $S_n(\Omega)$ and $C_n(\Omega)$ respectively.

Near the middle of its range $\sigma$ is small, and the left members of the equations (4.13) are essentially represented by the early terms of their expansions in powers of $\sigma$. Thus the equations become


1 
{(» + — (29)!/? + O(2-"/? log 


+ +0 =0 


the values of $n$ concerned being such as make the initial term small. The formulas which are valid in this case, i.e., when $A$ and $\Omega$ are nearly equal, are
thus
(n — Bx — 
log (322) — (2) 
(n+ — 
log (322) — 21’(4)/T (4) 


= (320144 +00), 


(4.13b) 


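In particular, the values of $\Omega$ for which $\sigma = 0$ is a root, i.e., for which there is a characteristic value equal to $\Omega$, are found to be as follows:*

\[
\begin{aligned}
&\text{If } S_n(\Omega) = \Omega, \text{ then } (2\Omega)^{1/2} = \left(n - \tfrac{1}{4}\right)\frac{\pi}{2} + O\!\left(\frac{\log n}{n}\right),\\
&\text{If } C_n(\Omega) = \Omega, \text{ then } (2\Omega)^{1/2} = \left(n + \tfrac{1}{4}\right)\frac{\pi}{2} + O\!\left(\frac{\log n}{n}\right).
\end{aligned} \tag{4.15}
\]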


* These values were considered by Goldstein, S., in A note on certain approximate solutions of 
linear differential equations, etc., Proceedings of the London Mathematical Society, (2), vol. 28 (1928), 
p. 87, where the results are stated in the following form: 

cos 

Hf Sa(Q)=9, then ) sin 1)", 
242 cos (82)"2~(—1)", 


If C,(Q2) = Q, then 212 sin (89) ¥2~(—1). 




Finally near the lower end of its permitted range $\sigma$ is large but negative, and the second of formulas (4.14) gives


1 
4 | o| 


The characteristic equations (4.13) accordingly become respectively 


+ — log —— — “1/2 lo =—) 

for the characteristic value S,(Q) ; 
(4.13c) 
322 


Que 
+e- ry + O(2-"/? log Q) + o(- ) 


o 
20)'2 + — |] 
(22) og 


for the characteristic value C,(Q). 
These are, therefore, the forms which are valid when $\Omega$ is near the upper end of its permitted range of values, or, in other words, when $A$ is near the lower end of its possible range.
4.6. The characteristic exponent. The formulas (4.6d) and (4.10b) yield 
for the evaluation of 9 in (25b) 


= | cos E, cos — 1, 
1 


142 


where €, and €, are as defined in (4.11). The third of the formulas (4.14) may 


be made to give further 
cosh 
\ 2 
whence 


(4.16) © = 2{1 + e-**}1/2[1], cos Ez cos E, — 1, 


and the characteristic exponent is obtainable from the appropriate formula 
(25a). 

The (Q, A) sub-regions of the domain IV of Figure 1 which comprise 
parameter values for which the differential equation has stable solutions are 
those for which the value of © is less than unity. It is evident from the 
formula (4.16) that these sub-regions become more and more attenuated as ¢ 
decreases, i.e., as the right-hand boundary of the configuration IV is ap- 
proached. 




CHAPTER 5 
THE CONFIGURATION V 


5.1. Preliminaries. Abutting the configuration of the preceding chapter is that denoted by V in Figure 1, in which $\Omega$ is taken to be large and


(5.1) 0<A<2- 


In this case the substitutions 


(5.2) 
= s=— 
2 


reduce the differential equation (1) to the form (3) with 
(5.3) 


The parameter p is bounded below by the constant M; while o? is confined 
to the fixed closed range 0 <o*<1, its smallest possible value being in fact 
M,0-"?, 

With the strip (13) chosen as R,, the region R, is 


(5.4) R,: 


and within this xo? admits just one zero which is simple and is located on the 
axis of reals at the point 


The position of $s_0'$ varies with $\sigma$ but is restricted to the fixed interval $(2^{-1/2}, \pi/4)$.
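The stated interval for the zero can be illustrated with a short computation. This is a sketch under assumptions inferred from the surrounding (partly illegible) formulas rather than read from them: that the coefficient vanishes where $2\sin^2(\sigma s)/\sigma^2 = 1$, so that $s_0' = \sigma^{-1}\sin^{-1}(\sigma/2^{1/2})$, and that $\sigma^2 = (\Omega - A)/\Omega$, so that $x_0 = \sigma s_0'$ satisfies $\cos 2x_0 = A/\Omega$. The helper name `zero_position` is hypothetical.

```python
import math

def zero_position(sigma):
    """Assumed location of the zero: s0' = asin(sigma / sqrt(2)) / sigma, for 0 < sigma <= 1."""
    if not 0.0 < sigma <= 1.0:
        raise ValueError("sigma is assumed to lie in (0, 1]")
    return math.asin(sigma / math.sqrt(2.0)) / sigma

# Tabulate s0' and check the turning-point relation cos(2*x0) = 1 - sigma^2 = A/Omega.
samples = []
for sigma in (0.01, 0.3, 0.7, 1.0):
    s0 = zero_position(sigma)
    x0 = sigma * s0                      # turning point in the original variable
    samples.append((sigma, s0, math.cos(2.0 * x0), 1.0 - sigma * sigma))
```

Under these assumptions $s_0'$ increases from $2^{-1/2}$ (as $\sigma \to 0$) to $\pi/4$ (at $\sigma = 1$), which matches the interval quoted in the text.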

If R, is thought of as cut along the axis of reals from s=0 to s=5y’, the 
values of ® on its boundaries are the following: 

For s’’ =0+,0<s'<sy’, 


2 sin? os’) 1/2 
@ = et i, 1 — ————->__ ds’. 


For s’ =0, s’’=0, 


(2sinh? os” 1/2 
= (0) + f + it ds’. 


0 


x = 0. 
Oss’ s—, 4 
So = —sin-!—- 
4 
| 


For s’ =2/(2a), s’’=0, 


(2 cosh? os” 1/2 
= —)+ f ———- - 1 
20, 0 


The maps of R, upon Ry, and hence of R, upon R;, are thus revealed, the 
latter being as indicated in Figure 7: 


[Fig. 7]


5.2. The hypotheses. The discussion by which the uniform fulfillment of 
the requirements of §1.2 by the present differential equation may be estab- 
lished, will be omitted as to detail inasmuch as it proceeds almost entirely 
like that of §3.3. In virtue of the values (5.3) the functions (6) are in this 
case explicitly 

| = 0. 


When |s—so’| is great the formulas 
sin os 23/2 os 


o~ 212 ~ — sin? — 
2 


may be used, while for intermediate values the first of the formulas (5.3) may 
be written 




sin? o(s — s¢ 


sin — s¢ ) 


o2 
For small values of | s—so’| it may be shown that 


{1 _ — 


3(2 — o?)1/2 5(2 — a?) 


eo}, 


and with these formulas at hand the arguments of §3.3 may be paralleled. 
5.3. The solutions relative to z=0. The point $z = 0$ may, as is seen from Figure 7, be regarded as lying in the sub-region $\Xi^{(1)}$. Moreover, the zero of $\chi_0^2$ being simple the formulas of §1.4 are applicable, with $l = 1$ as the appropriate value. The formulas (11b) and (11a) thus become, in the manner now familiar, the following:
When |¢| <N, 


1/2 


u(z) = { 


(5.5a) 


1/2 


and when z lies in Z, and |£| =>N, 


2 


(5. 5b) 


with coefficients 


1934] 679 | 
1 ot 
Uo(2) {Ke P+ Ke 
l —1 0 1 
K® ett: [1] ett: [1] — iets [1] 
K® | — ie*ts[1] ie*s[1] 
K® — ie~its[1] et:[1] et,[1] 




The symbols involved would have the evaluations 


{A — Qcos 22} 1/2, POL _ exit — 


t= f {A — Qcos 22} 1/2dz, xo = 3 cos? A/Q, 


0 


f {Q cos 2x — A}*/2dx. 
0 


When z is real and less than x» the relation ¢=|£| e*“/? is valid and hence 


1/2 


J + J 1/3(€) é | ). 


The formulas given in (5.5) thus reduce when the variable is real to the follow- 
ing: 
When 0 <x and 


re: 


a 
u.(x) = [1] cosh f {2 cos 2x — 


Q cos 2x — 
When x <x», and |£| <N, 


| & 


— A)(Q cos 2x — A)} 1/4 


[| )], 


u(x) = 
(5.6b) 


with |¢| = f "{9 cos 2x — A} 


When <x, and <N, 


{ (Q A)(A — Qcos 2x) } an) + 


(5.6c) u(x) = 


When x) <x<7/2, and || 
[1 
— A)(A — 2x) 


=z 
sin |=+ f {A— 22} 
z0 


u(x) = 


[July 
(5. ; 
| 




In the cases (b), (c) and (d) the representation of $u_e(x)$ may be formally obtained from that of $u_o(x)$ by replacing the factor $(\Omega - A)^{-1/4}$ by $(\Omega - A)^{1/4}$.

5.4. The solutions relative to z=π/2. The formulas (11) with the subscripts $\alpha$ replaced by 2, where the latter denote values corresponding to $z = \pi/2$, may be made to yield also the solutions $u_\alpha(z)$ and $u_\beta(z)$. Since the point $z = \pi/2$ lies in the sub-region $\Xi^{(0)}$ the value $l = 0$ is appropriate and the formulas obtained are the following:
When |é| <N, 


Iwo? \1/2 
Ua(z) = cos - = 


— sin + “) 


(5.7a) 


+ cos + [é . 


When zis in =”, and |é| 


1 o2 
Ua(z) = ) { KMeit + 
(5. 7b) 
) + 


l -1 0 1 

= — ie~*:[1] — cos (& 

Ke, cos (& *)] ie‘*2[1] ie®*2[1] 

(5.7¢) 


: 
F with coefficients 
Ke 
4 




In terms of the original variables 


= {a+ 


& = f {A — Qcos 2x} 


0 


The forms obtained for real values of the variable are the following: 
When 0<x<2p, and |£| 


= + A)(@cos 2x — A)} 1/4 


exp cos 2x — 


(5.8a) 


When x So, and |é| 


1/6 
a(x) = §| “Yl )] 


{ (Q+A)(2 cos 2x—A) } 1/4 


Qcos2x—A 


(5.8b) 


When xo <~, and || <N, 


(2m /3)1/2E1/6 ™\ 


=) sin (: =) (é)] 


+cos =) it . 


[July 
| 
(5.8c) 




When x)<x<7/2, and |£| =N, 


x“) = in| {4 25} dx, 


{(Q+A)(A — 2x) } /2 


2 + A 1/4 z 
a [1] cos {A — Qcos 2x} |, 
A — Qcos 2x 


5.5. The solutions of the associated equation. The positive axis of imagi- 
naries in Figure 7 lies in the sub-region =. The formulas (5.5) appropriate 
to this region are to be used, therefore, in obtaining the solutions of the equa- 
tion (2) for real values of the variable by the substitutions (12). The formulas 
thus found are 


{(@ — A)(Qcosh 2x — A)} 1/4 | Jf vas], 


= cos {@ cosh 2x — 


Qcosh 2% — A 


(5.9) 


5.6. The characteristic values and exponent. The forms (5.6d) show that the characteristic values for both even and odd Mathieu functions are in this case determined by equations

\[
\left[\frac{\pi}{4} + \int_{x_0}^{\pi/2} \{A - \Omega\cos 2x\}^{1/2}\,dx\right] = \frac{n\pi}{2}, \tag{5.10a}
\]

the proper correlation of the indices of the roots with the integer $n$ being duly regarded.

If $k_n$ is defined in terms of $A$ and $\Omega$ by the same formula as is the $\sigma$ of chapter 4, i.e., by (3.15), the substitutions


4k, 


cosx = hsin¢, 


reduce the equation (5.10a) to the form 


5.10b 22)'/*h?G(1, h?) +O 
(5.10b) + +0(——) 
in which $G$ is the elliptic integral of (26). In the range of transition from the configuration of chapter 4 to that of the present chapter, $k_n$ is negative and $h^2$ accordingly little less than unity. The evaluation of (5.10b) to the form


+ (29)? 4 ky 322 + by + Jog 2) +0( Qu2 ) 
— — log og =— 


(5.8) 




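may, therefore, be obtained by the use of (26b), and a comparison of the result with the equations (4.13c) shows that the characteristic values which occur as the roots of equations representable by (5.10a) are respectively $S_n(\Omega)$ and $C_{n-1}(\Omega)$. In other words the characteristic equations are as follows:

\[
\int_{x_0}^{\pi/2} \{A - \Omega\cos 2x\}^{1/2}\,dx + O\!\left(\frac{\Omega^{1/2}}{\Omega - A}\right) = \left(n - \tfrac{1}{2}\right)\frac{\pi}{2}
\]

for the characteristic value $S_n(\Omega)$;

\[
\int_{x_0}^{\pi/2} \{A - \Omega\cos 2x\}^{1/2}\,dx + O\!\left(\frac{\Omega^{1/2}}{\Omega - A}\right) = \left(n + \tfrac{1}{2}\right)\frac{\pi}{2} \tag{5.11}
\]

for the characteristic value $C_n(\Omega)$.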
In the consideration of the characteristic exponent the formulas (5.6d) 
and (5.8a) in conjunction with (25b) are found to lead to the evaluation 


(5.12) © = [cos 2] — 1, 


and with this the value of $\mu$ is given by the formula (25a). Since the right-hand member of (5.12) can be exceeded by unity only when the cosine is very small, it is evident that the unstable solutions greatly predominate in the present configuration.

An evaluation of the several elliptic integrals involved may be made to 
show that the transition from the formula (4.16) to (5.12) is a continuous one. 


CHAPTER 6 
THE CONFIGURATION VI 


6.1. Remarks. The configuration designated by VI in Figure 1 is to be 
that in which A is negative and 


(6.1) — M012} <a <0. 


It clearly differs from that of the preceding chapter only in the sign of A. The 
distinction between the two configurations is indeed largely an artificial one, 
entered into primarily for the purpose of utilizing the discussion of §5.2 with- 
out modification when parameter values admitted by (6.1) are concerned. 
For in this latter case the substitutions 

6.2 _2+4 3ri/2 

(6.2) p= fait 

transform the differential equation (1) into the form (3) with precisely the coefficients (5.3), with $\sigma$ restricted precisely as in the earlier case. The deductions of §5.2, therefore, serve again to show that the requirements of the general theory are uniformly fulfilled.

By their definitions the intermediate variables s, ¢, and ®, and the para- 
meters o and p, differ from the corresponding quantities in chapter 5. The 
ultimate variables z and & are, however, found to have the same relation to 
each other, so that Figure 7 continues to remain valid in the present con- 
figuration. It is found as a consequence that the various formulas deduced 
in §5.3, §5.4, and §5.5 apply also in the present instance, provided they are 
expressed entirely in terms of the original variables z, A, and Q. 

6.2. The characteristic values and exponent. With the prevailing forms 
of the solutions exactly those of chapter 5 the characteristic equations of 
course remain of the form (5.11). It is of interest, however, to obtain from 
these equations more explicit formulas which are valid near the lower end 
of the admitted range of values for $\Lambda$. For such values $h^2$, which may be
written


$h^2 = \frac{\Omega - |\Lambda|}{2\Omega},$


is small of the order of $\Omega^{-1/2}$, and in the equation (5.10b) the evaluation given
by (26a) is appropriate. The equation thus becomes 


h? 
+ + = 2n—1. 


It is evident that the integers n concerned are those of a bounded set, the
equation being expressible for such n in the form


$\Omega + \Lambda = (2n - 1)(2\Omega)^{1/2} + O(1).$


Inasmuch as the characteristic equations represented by (5.10b) were found
to be those for $S_n(\Omega)$ and $C_{n-1}(\Omega)$, it follows that for the algebraically smaller
of the presently admitted values of $\Lambda$ the characteristic values are described
by formulas


(6.3)  $S_n(\Omega) = -\Omega + (2n - 1)(2\Omega)^{1/2} + O(1),$

       $C_n(\Omega) = -\Omega + (2n + 1)(2\Omega)^{1/2} + O(1).$

The characteristic exponent is again given by (25a) and (5.12).
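
A rough numerical check of the second formula (6.3) for n = 0 may be sketched as follows. The sketch assumes that the differential equation (1) has the form u'' + (Λ − Ω cos 2z)u = 0, which is inferred here from the integrands of chapters 8 and 9 and is not quoted from the text, and that C_0(Ω) is the smallest characteristic value belonging to an even solution of period π. The function names and the truncation order `terms` are illustrative choices, not part of the paper.

```python
import math


def count_below(diag, off, x):
    """Sturm count: number of eigenvalues of a symmetric tridiagonal matrix below x."""
    count = 0
    q = 1.0
    for i, d in enumerate(diag):
        q = d - x - (off[i - 1] ** 2 / q if i > 0 else 0.0)
        if q == 0.0:
            q = 1e-300  # nudge off an exact zero so the next division is defined
        if q < 0.0:
            count += 1
    return count


def smallest_eigenvalue(diag, off, tol=1e-10):
    """Bisection for the smallest eigenvalue, bracketed by Gershgorin bounds."""
    n = len(diag)
    radii = [
        (abs(off[i - 1]) if i > 0 else 0.0) + (abs(off[i]) if i < n - 1 else 0.0)
        for i in range(n)
    ]
    lo = min(d - r for d, r in zip(diag, radii))
    hi = max(d + r for d, r in zip(diag, radii))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) >= 1:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def c0(omega, terms=80):
    """Smallest characteristic value for even, pi-periodic solutions.

    ASSUMED equation: u'' + (Lambda - omega*cos 2z) u = 0, with u expanded in
    cos(2kz). The symmetrized coefficient matrix has diagonal (2k)^2 and
    off-diagonal omega/2, except for sqrt(2)*omega/2 in the first position.
    """
    half = omega / 2.0
    diag = [float((2 * k) ** 2) for k in range(terms)]
    off = [math.sqrt(2.0) * half] + [half] * (terms - 2)
    return smallest_eigenvalue(diag, off)


if __name__ == "__main__":
    omega = 100.0
    leading = -omega + math.sqrt(2.0 * omega)  # (6.3) with n = 0, O(1) term dropped
    print(c0(omega), leading)
```

The difference between the two printed numbers is the O(1) remainder of (6.3); on the assumptions stated it should stay bounded as Ω grows.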


CHAPTER 7 
THE CONFIGURATION VII 


7.1. The transformed differential equation. When A is large and negative 
and 


(7.1) — < 




the configuration is that designated by VII, Figure 1. In this case the sub- 
stitutions 
Q2+A 


= 1/2p—7ri/2 = 
(7.2) p = (320)'/*e o G20)" 


2 


bring the differential equation (1) into the form (3) with 


xo(s,o) =} sins, 


xi = to. 


(7.3) 


The transformed equation thus differs from that obtained in chapter 4 only 
to the extent that o is replaced by io. The formulas for ¢ and ® given in (4.4) 
are adaptable to the present case by the substitution of —a/|p| in place of 
a/p, a change which is easily seen to affect in no way the validity of the argu- 
ments of §4.1. That the differential equation in the present instance uni- 
formly satisfies the hypotheses of §1.2 may, therefore, be accepted without 
further consideration. 

The regions R, and Ry which correspond to the strip (13) are, both as to 
outline and relative orientation, precisely like the z and & regions shown in 
Figure 5. Since under the relations (7.2) the region R, is a reflection of R, in 
the point $s=\pi/4$, whereas R; is obtainable from Ry by a rotation besides
the change of scale, the figure which relates the ultimate regions R, and R; for 
the chapter at hand is as indicated in Figure 8. The division of these regions 
into the sub-regions =“ is also as shown. 


Fig. 8




7.2. The solutions. The origin z=0 corresponds to $s_1=\pi/2$ and lies in the
sub-region =, The principal solutions relative to this point may accordingly 
be deduced by the substitution of the values (4.9) (with o replaced by ic) 
into the formulas (8b), the subscript a being taken as 1. To this extent the 
process coincides with that by which the forms (4.10) were deduced. In the 
present instance, however, certain terms may be dropped from the resulting 
formulas for, as may be seen from Figure 8, the quantity 7£, is real and posi- 
tive and e~* therefore asymptotically negligible in comparison with e*. It 
is found thus that the following formulas hold: 

When z is (anywhere) in R,, and || =, 


when | ¢| <N, 


(7.4a) 


Qeril4 


u(z) = (= TG — 


io) 
(é, ic) ] 


2ip 


1 


(7.4b) 


ren Qeril4 


u(z) = — 


c) [ats (E, ic) hi 


2 


1 


In these as in subsequent formulas any term is to be omitted if ¢ is such that 
the gamma function involved is infinite. 

The point $z=\pi/2$ corresponds to $s_2=0$ and the principal solutions relative
to this point are therefore to be obtained precisely as were the solutions of 
§4.2. The formulas found are as follows: 

When |¢| <N, 


2 1/2 
(7.5a) 
1 1/2 
ug(z) (5) W -14(2€) | ; 


ht, 
| 




when |¢| and zis in 


() 


ha 
2it)7e 
1 
[ 


_ 


with coefficients given by the table 
l -1 


a 


ho e(e-1/4) i (o- 1/4) 
8 


In terms of the original variables 
cosz ( ) 
o= tan(— — 
4 322 + 2 
Q2+A4 1 + sin z 


it = (22)/2(1 — sin z) + log 


which permits the abbreviated relations 


[= “| 
4 


eit = e(lel/4) (i—sinz) etfs = (<) ewe. 
2 


The specialization of the various formulas to the case in which the vari- 
able is real may be made as usual, it being noted that then i= |e]. The 
representations which result are as follows: 


(7. 5b) 
ug(z) = (=) 




When |¢| 


sec x 


1/2 1+si 
U(x) = [1]: sinh sin x + log 
1 


1 — sinz 
1 


1 — sin 


(7.6a) 
u(x) = (sec x)!/2[1], cosh | can sin x + o log 
when |¢| <N, 

(2m) (20)? 


(322)°/2+3/8(1 + sin x)1/4 


2 


u(x) = 


1 
_ é| ’ 


(2m) (20)? 2 
(320)°/2-1/8(1 + sin x)!/4 ‘ra 
1 


(7. 6b) 


u(x) = 


the symbols M representing the confluent hypergeometric functions which
occur in the formulas (4.5).
When |é| =W, 


1 — sin x 


ate (27 sec + sin | 
1 


(32Q)°/2+3/8 
(2m sec /1+ sin x\" 
us(x) = ( ) 


(32Q)7/2+1/8 \1 — sin x 


(7.7a) 


@(22)"? (1-sinz) 
= a) 1 


When |¢| <N, 
—1 
(82)"2(1 + sin 
1 

(1 + sin x)!/4 

The solutions of the associated Mathieu equation, as obtained from the 
forms (7.4a) by the method of §1.5, are for real values of the variable repre- 
sented thus: 


(22 cosh x)!/2 


(7.7b) 
[] )]. 


ug(x) = 


V(x) = sin [(2Q)1/? sinh « — 2¢ tan-! (sinh x) 


(7.8) 


1 
v(x) = cos [(29)1/? sinh « — 2¢ tan-! (sinh x) 




7.3. The characteristic values and exponent. The characteristic equa- 
tions (17) and (18) may obviously if desired be rewritten in the forms 
ta(0) =0, ug (0) =0 and u'(0) =0, u.’(0) =0. It accordingly follows from the 
formulas (7.7a) that any characteristic value must be a root of the one or the 
other of the equations 


(7.9) 0, Eerie 0 


If $\sigma$ is not positive the relations (7.9) are manifestly impossible. Hence
no characteristic values exist when $\Lambda < -\Omega$, a fact which may be simply con-
cluded from a direct perusal of the differential equation. When $\sigma$ is positive
and of suitable magnitude, on the other hand, a relation (7.9) may be satis-
fied in virtue of the gamma function becoming infinite. The appropriate
values are clearly those for which

whence the characteristic equations are found to be of the form 
= — 2+ (2n — — O(log 


This result when compared with the formulas (6.3), with which it must be in 
accord for suitable values of A and Q, shows that the characteristic values in 
the present configuration are given by formulas 
$S_n(\Omega) = -\Omega + (2n - 1)(2\Omega)^{1/2} + O(\log \Omega),$
$C_n(\Omega) = -\Omega + (2n + 1)(2\Omega)^{1/2} + O(\log \Omega).$
Finally the computation of the characteristic exponent depends only upon 


the evaluation of the quantity © given in (25b). This evaluation from the 
forms (7.6b) and (7.7a) is found in the present case to be 


1 
(7.11) @ = | 
(322)" — o) — o) 


(7.10) 


CHAPTER 8 
THE CONFIGURATION VIII 


8.1. The change of variables. The configuration numbered VIII in Figure 
1 is to be defined as that in which A is negative and numerically large, while 


—1 
M, 


The substitutions for the transformation of the equation (1) are to be 


3 
= 




(8.2) ~ Tal’ s=+(=-:), 


in which case the resulting equation of the form (3) has precisely the coeffi- 
cients (3.3). The value of o is again confined to the range (3.4), and if the 
half-strip 


8.3 
(8.3) 0<y, 


is chosen as the domain of z, the corresponding region R, is precisely that of 
(3.6). In terms of s, therefore, the present equation coincides entirely with 
that of chapter 3. The hypotheses are in consequence uniformly fulfilled and 
Figure 2 again applies. The latter evidently leads in the present instance to 
Figure 9. 


Fig. 9


The extension of the representations which are to be obtained from R, 
into the entire strip (13) may be made directly by observing that u.(z) and 
ug(z) are respectively odd and even as functions of the variable (s—7/2), 
and by applying the identities (14a) and (14b) to the formulas for ~,(z) and 
u.(z). With this accomplished the further considerations of §1.6 are, of course, 
applicable. 

8.2. The solutions. The zero of x¢ is of the first order, and, as may be seen 
from Figure 9, both the points z=0 and z=7/2 lie in the sub-region 2. 
The formulas (11a) and (11b) may, therefore, be drawn upon, with 4=2, and 
lead to the formulas which follow. 




When z is in 2, and |£| =N, 


1 o2 
( ) { + K2-t¢-it} 
2 0,1 0,2 


1 
(2) 


1/2 
) + , 


with coefficients 


[Jette [1 


3)] 


[— 


(8.4a) 
1 o? 
(2) 2 (=—) + me 
(8. 5a) 
1 1/2 
l -1 0 1 2 
Ki : [1 [1]e-*s 
[— [— [— 1Jests [— j 
Ket | [— iets [— | 
l —1 0 1 2 
Ki! [iJests [1 
Ket | [— [— 1]e*s 
Kyi | | i 
et 




When |é| 


xo? 
(8. 4b) 


1/2 
u(z) = (=) EM 19(£) + 


12 


(=) ei cos( + *) ] 


+ sin - =) . 


For use in these formulas, 


— {| A| + cos 22}12, 


{A — cos 22} zo = 4 cos! A/Q, 
Vo 
f {| 4| — @cosh 2y} yo = 4 cosh | A| /2, 
0 
if {| A| + cos 2x} 
0 


It is found that for all real values of z on the interval (0, 3) the respective 
formulas are 


1) ree 
8.4¢c 


| a|+9 | 
1 h A Q 2x} 
ada) [1] cosh | f c0s 22} 



(8. 5b) 

4 
poi 

m= — +a}, 3 
j 
_ 




[1] . 
8.5c) 


=4 cosh { | 4|+2 cos 


| A| +2 cos 2x 


The axis of imaginaries in Figure 9 likewise lies in the sub-region =, 
and the formulas for the solutions of the associated Mathieu equation are 
accordingly found to be 


(1] 
A|+2)(| 4|+@cosh 2x) } 1/4 22} az], 


(8.6) 


cos | A|+2 cosh 22} 


| cosh 2x 


8.3. The characteristic exponent. It is evident that the present configura- 
tion admits no characteristic values. The formulas (8.4c) and (8.5c) yield 


the evaluation 


= [2] f {|| + 2008 1, 
0 


and the formula (25a) accordingly gives the characteristic exponent in the 
form 


2 
(8.7) l= {| A] + Qcos 22} |. 


Clearly, the configuration is one of unstable solutions.


CHAPTER 9
THE CONFIGURATION IX


9.1. The differential equation. In this final configuration to be considered,
i.e., IX of Figure 1, the parameter A is large and negative while 


(9.1) 


The variable is to be restricted to any region in which a relation (2.4a) is 
fulfilled with some constant M;, and this constant is that which figures in 
(9.1). The substitutions 

Q 


( 

| 
M, 




reduce the differential equation to the form (3) with the coefficients (2.3), the 
parameter o being confined as in (2.5). As was remarked in chapter 2, the 
Stokes’ phenomenon is absent and a single formula serves to describe a solu- 
tion over the entire strip given by (2.4a) and (2.4b). 

9.2. The solutions. The solutions (2.7) apply to the present differential 
equation (3) and may be used in the formulas (8b). It is found thus that 


(9. 3) 1 1/2 


1 1/2 
= (2) + 
with the symbols evaluated by the relations 


pd = if | A| + cos 22} 1/2, 


(9.4) 


por = pda = if | A] — 
f {| A| + cos 


— = f + 2 cos 22}1/%dz. 
0 


For real values of z these formulas are found to reduce precisely to the forms 
(8.4c) and (8.5c), while the forms which describe the solutions of the equation 
(2) are again found to be those of (8.6). As in the case of chapter 2 the con- 
clusion is possible that the symbols [ ] may be dropped from the formulas 
when $\Omega=0$.

Lastly, the formula for the characteristic exponent is that already given 
in (8.7), and there are, of course, no characteristic values. 


UNIVERSITY OF WISCONSIN, 
MADISON, WIS.




THE MOVING TRIHEDRON* 


BY 
E. P. LANE 


1. Introduction. A classical method of studying the metric differential ge- 
ometry of curves and surfaces in three-dimensional space is based upon the 
use of a moving trihedron. A trihedron of reference is associated with an 
ordinary point of the curve, or surface, under consideration, and then the 
point is allowed to vary over the whole, or a suitably restricted portion, 
thereof. The theory which thus originates is particularly powerful in solving 
problems concerning two curves, or two surfaces, whose points are in one-to- 
one correspondence. 

The theory of the moving trihedron in the study of curves, as outlined in 
§2 below, is due to Professor G. A. Bliss, who employed it effectively in his 
lectures on metric differential geometry at the University of Chicago. It was 
later also used by the author, to whom the extension to surfaces in the third 
and fourth sections is due. The essentially new feature of the treatment both 
for curves and for surfaces is found in the recursion formulas upon which the
discussion rests. As these do not seem to have appeared elsewhere in the 
literature, the following exposition is designed to exhibit them and deduce 
some of their consequences. 

2. Curves. The method of the moving trihedron as employed in the theory 
of curves will now be explained. Let us first of all establish an orthogonal 
cartesian coordinate system, which will be designated hereinafter as the fixed 
coordinate system. Referred to this system let the parametric equations of a 
real proper non-rectilinear analytic curve C be 


(1)  $x = x(s), \quad y = y(s), \quad z = z(s),$


the parameter s being the arc length measured from some fixed point to the 
ordinary point P(x, y, z) of C. Further, let us consider a point Q whose co- 
ordinates X, Y, Z are given as functions of s by equations of the form 


(2)  $X = X(s), \quad Y = Y(s), \quad Z = Z(s).$


If these three functions of s are all constant, the point Q is fixed, relative to 
the fixed coordinate system, when the point P varies on the curve C. This 
case will be excluded hereinafter, unless the contrary is indicated. Then as s 
varies, the point P moves along the curve C, and the point Q traces a curve 


* Presented to the Society, April 7, 1934; received by the editors March 3, 1934. 


$C_1$ represented by the parametric equations (2). The points P, Q of the curves
$C$, $C_1$ are in one-to-one correspondence, corresponding points being those as-
sociated with the same value of the parameter s.

At a point P of a curve C there is the local coordinate system with its origin
at P, with the $\xi$-axis along the tangent, the $\eta$-axis along the principal normal,
and the $\zeta$-axis along the binormal. The equations of transformation between
the coordinates X, Y, Z of the point Q that corresponds to P and the local
coordinates $\xi$, $\eta$, $\zeta$ of Q are


(3)  $X = x + \alpha\xi + l\eta + \lambda\zeta,$
     $Y = y + \beta\xi + m\eta + \mu\zeta,$
     $Z = z + \gamma\xi + n\eta + \nu\zeta,$


wherein $\alpha$, $\beta$, $\gamma$ are the direction cosines, in the fixed coordinate system, of
the tangent; $l$, $m$, $n$ are those of the principal normal; and $\lambda$, $\mu$, $\nu$ those of
the binormal, of the curve C at the point P.

When the point P moves along the curve C, the local trihedron of C at P 
also moves, of course, and hence is appropriately called the moving trihedron 
of the curve C. The local coordinate system associated with the moving tri- 
hedron will be designated hereinafter as the moving coordinate system. The 
local coordinates $\xi$, $\eta$, $\zeta$ of the point Q corresponding to P are themselves func-
tions of s. If these functions are constants, the point Q is rigidly attached to
the moving trihedron, so that the motion of Q relative to the moving tri-
hedron is zero.

For the purpose of investigating the relations of the curves $C$, $C_1$, it is
convenient to know the direction cosines of the tangent, principal normal,
and binormal of $C_1$ referred to the moving trihedron of C. In order to calcu-
late these, some analytical consequences of equations (3) will next be de-
duced. If equations (3) are differentiated with respect to s, the results can
be reduced, by means of the well known Frenet formulas, to


(4)  $X' = \alpha A_1 + l B_1 + \lambda C_1,$
     $Y' = \beta A_1 + m B_1 + \mu C_1,$
     $Z' = \gamma A_1 + n B_1 + \nu C_1 \qquad (X' = dX/ds, \cdots),$

wherein the coefficients $A_1$, $B_1$, $C_1$ are defined by the formulas


p T 


and $1/\rho$, $1/\tau$ are respectively the curvature and torsion of the curve C at
the point P. A second differentiation and reduction by the Frenet formulas
lead to

(6)  $X'' = \alpha A_2 + l B_2 + \lambda C_2$


and similar formulas for $Y''$, $Z''$, in which $A_2$, $B_2$, $C_2$ are defined by
A C B. 

(7) 
p T T 


Repetition of the process gives
(8)  $X''' = \alpha A_3 + l B_3 + \lambda C_3$
and similar formulas for $Y'''$, $Z'''$, in which $A_3$, $B_3$, $C_3$ are defined by


B A, C 
Bo Qo «Ded. 

p p T T 
An easy induction would yield 


(10)  $X^{(n)} = \alpha A_n + l B_n + \lambda C_n$


and similar formulas for $Y^{(n)}$, $Z^{(n)}$, in which the coefficients $A_n$, $B_n$, $C_n$ are
given by the recursion formulas


C.-1 Ba-1 


(11) 4. = — + 


p T T 


It should be observed that $A_n$, $B_n$, $C_n$ are the components in the moving
coordinate system of that vector whose components in the fixed coordinate
system are the derivatives $X^{(n)}$, $Y^{(n)}$, $Z^{(n)}$. Such a vector may be called a de-
rivative vector. The components $A_n$, $B_n$, $C_n$ are not themselves actually deriva-
tives, but they behave in some respects like derivatives.
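
The way the local components reproduce the derivative vector can be tried numerically on a circular helix. The sketch below is only an illustration under stated assumptions: the Frenet formulas are taken with the sign convention t' = κn, n' = −κt + τb, b' = −τn (the torsion sign in the paper's own formulas may differ), the point Q is rigidly attached to the trihedron, and the helix constants and function names are invented for the example. Under these assumptions every component is constant along the helix, so the derivative terms of the recursion vanish and only the curvature and torsion terms remain.

```python
import math

# Circular helix (illustrative values): x = a cos t, y = a sin t, z = b t, t = s / c.
A, B = 2.0, 1.0
C = math.hypot(A, B)
KAPPA = A / C**2  # curvature 1/rho
TAU = B / C**2    # torsion 1/tau under the ASSUMED convention n' = -kappa*t + tau*b


def frame(s):
    """Point, tangent, principal normal and binormal of the helix at arc length s."""
    t = s / C
    sn, cs = math.sin(t), math.cos(t)
    point = (A * cs, A * sn, B * t)
    tangent = (-A * sn / C, A * cs / C, B / C)
    normal = (-cs, -sn, 0.0)
    binormal = (B * sn / C, -B * cs / C, A / C)
    return point, tangent, normal, binormal


def combine(a, u, b, v, c, w):
    """Componentwise a*u + b*v + c*w for 3-vectors u, v, w."""
    return tuple(a * ui + b * vi + c * wi for ui, vi, wi in zip(u, v, w))


def point_q(s, xi, eta, zeta):
    """Fixed coordinates X, Y, Z of Q from its local coordinates, as in equations (3)."""
    p, t, n, b = frame(s)
    offset = combine(xi, t, eta, n, zeta, b)
    return tuple(pi + oi for pi, oi in zip(p, offset))


def local_components(order, xi, eta, zeta):
    """A_n, B_n, C_n for constant xi, eta, zeta on the helix (assumed convention)."""
    a, b, c = 1.0 - KAPPA * eta, KAPPA * xi - TAU * zeta, TAU * eta
    for _ in range(order - 1):
        # Derivative terms are zero here because a, b, c do not depend on s.
        a, b, c = -KAPPA * b, KAPPA * a - TAU * c, TAU * b
    return a, b, c


def second_derivative_fd(s, xi, eta, zeta, h=1e-3):
    """Central-difference estimate of (X'', Y'', Z'')."""
    qm, q0, qp = (point_q(s + d, xi, eta, zeta) for d in (-h, 0.0, h))
    return tuple((m - 2.0 * z + p) / h**2 for m, z, p in zip(qm, q0, qp))


def second_derivative_from_components(s, xi, eta, zeta):
    """(X'', Y'', Z'') rebuilt from A_2, B_2, C_2 as in equation (6)."""
    _, t, n, b = frame(s)
    a2, b2, c2 = local_components(2, xi, eta, zeta)
    return combine(a2, t, b2, n, c2, b)


if __name__ == "__main__":
    args = (0.7, 0.3, -0.5, 0.8)  # s, xi, eta, zeta
    print(second_derivative_fd(*args))
    print(second_derivative_from_components(*args))
```

The two printed triples should agree to within the finite-difference error, which illustrates that the components found by the recursion are the local components of the second derivative vector.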

Some additional formulas will now be established. Let us make the con-
vention that the arc length $s_1$ of the curve $C_1$, measured from some fixed point
thereon, shall be an increasing function of the arc length s of C. Then squar-
ing and adding equations (4), and taking the positive square root, we find


(12)  $\frac{ds_1}{ds} = \left(\sum A_1^2\right)^{1/2},$


the summation being for cyclical permutations. Easy calculations now yield


(13)  $\frac{ds}{ds_1} = \frac{1}{(\sum A_1^2)^{1/2}}, \qquad \frac{d^2s}{ds_1^2} = -\frac{\sum A_1A_2}{(\sum A_1^2)^2}.$


Formulas for higher derivatives of s with respect to s; could be calculated 
but will not be needed in what is to follow. 
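Elementary calculus supplies the formulas


(14)  $\frac{dX}{ds_1} = X'\frac{ds}{ds_1},$
      $\frac{d^2X}{ds_1^2} = X''\left(\frac{ds}{ds_1}\right)^2 + X'\frac{d^2s}{ds_1^2},$
      $\frac{d^3X}{ds_1^3} = X'''\left(\frac{ds}{ds_1}\right)^3 + 3X''\frac{ds}{ds_1}\frac{d^2s}{ds_1^2} + X'\frac{d^3s}{ds_1^3}$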


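and similar ones for the derivatives of Y, Z. The second of (14) can be reduced
to


(15)  $\frac{d^2X}{ds_1^2} = \alpha L + lM + \lambda N,$


where L, M, N are defined by


(16)  $L = \frac{A_2\sum A_1^2 - A_1\sum A_1A_2}{(\sum A_1^2)^2},$
      $M = \frac{B_2\sum A_1^2 - B_1\sum A_1A_2}{(\sum A_1^2)^2},$
      $N = \frac{C_2\sum A_1^2 - C_1\sum A_1A_2}{(\sum A_1^2)^2}.$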


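Direct calculation results in
(17)  $\frac{dY}{ds_1}\frac{d^2Z}{ds_1^2} - \frac{dZ}{ds_1}\frac{d^2Y}{ds_1^2} = \left(\frac{ds}{ds_1}\right)^3(\alpha P + lQ + \lambda R),$
where P, Q, R are defined by
(18)  $P = B_1C_2 - B_2C_1, \quad Q = C_1A_2 - C_2A_1, \quad R = A_1B_2 - A_2B_1.$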


Finally, the curvature $1/\rho_1$ and the torsion $1/\tau_1$ at a point of the curve $C_1$
can without difficulty be shown to be given by the formulas


(19) 


| 
dX = x? ds 
ds; ds, 3 
4 
| 
1 P? 
pr (> A?)* 
— 2 2 2 |e i 
P?2 




The direction cosines of the tangent, principal normal, and binormal at a
point Q of the curve $C_1$, referred to the moving trihedron of the curve C at
the corresponding point P, can now be found by the familiar equations of
transformation of direction cosines. For example, the direction cosines of the
tangent of $C_1$, referred to the fixed coordinate system, are known to be


(20)  $\frac{dX}{ds_1}, \quad \frac{dY}{ds_1}, \quad \frac{dZ}{ds_1}.$

Therefore, by equations (4) and the first of (14), the direction cosines of the
tangent referred to the moving trihedron are found to be


(21)  $A_1\frac{ds}{ds_1}, \quad B_1\frac{ds}{ds_1}, \quad C_1\frac{ds}{ds_1}.$
Similarly, the direction cosines of the principal normal of $C_1$ in the fixed co-
ordinate system are known to be


(22)  $\rho_1\frac{d^2X}{ds_1^2}, \quad \rho_1\frac{d^2Y}{ds_1^2}, \quad \rho_1\frac{d^2Z}{ds_1^2},$

and in the moving coordinate system are found to be

(23)  $\rho_1L, \quad \rho_1M, \quad \rho_1N.$

Finally, the direction cosines of the binormal of $C_1$ in the fixed coordinate sys-
tem are known to be


(24)  $\rho_1\left(\frac{dY}{ds_1}\frac{d^2Z}{ds_1^2} - \frac{dZ}{ds_1}\frac{d^2Y}{ds_1^2}\right)$


and two similar expressions; hence these direction cosines in the moving co-
ordinate system are


(25)  $\left(\frac{ds}{ds_1}\right)^3\rho_1P, \quad \left(\frac{ds}{ds_1}\right)^3\rho_1Q, \quad \left(\frac{ds}{ds_1}\right)^3\rho_1R.$

The direction cosines of the tangent, principal normal, and binormal of the curve
$C_1$, referred to the moving trihedron of the curve C, are therefore respectively pro-
portional to

(26)  $A_1, B_1, C_1; \quad L, M, N; \quad P, Q, R.$


The general theory just outlined is capable of extensive applications. It 
forms a powerful tool for the study of curves which are transforms of a 
given curve, such as involutes, evolutes, parallel curves, and so on. But limi- 
tations of space do not permit inclusion of such developments here. 
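
A simple instance of such a transform is the locus of a point rigidly attached to the moving trihedron of a circular helix. The sketch below assumes that the curvature formula in (19) is the familiar expression |X' × X''|² / |X'|⁶ written in local components, that is 1/ρ₁² = ΣP² / (ΣA₁²)³, and it again takes the Frenet formulas with the convention n' = −κt + τb; both points are assumptions of the example, as are the function names. For a helix x = a cos t, y = a sin t, z = bt the attached point describes a coaxial helix of the same pitch, which supplies an independent value for comparison.

```python
import math


def curvature_of_c1(kappa, tau, xi, eta, zeta):
    """Curvature 1/rho_1 of the locus of Q when kappa, tau, xi, eta, zeta are constant."""
    # First and second local components under the assumed Frenet convention.
    a1, b1, c1 = 1.0 - kappa * eta, kappa * xi - tau * zeta, tau * eta
    a2, b2, c2 = -kappa * b1, kappa * a1 - tau * c1, tau * b1
    # P, Q, R as in (18).
    p = b1 * c2 - b2 * c1
    q = c1 * a2 - c2 * a1
    r = a1 * b2 - a2 * b1
    sum_a1_sq = a1**2 + b1**2 + c1**2
    return math.sqrt(p**2 + q**2 + r**2) / sum_a1_sq**1.5


def curvature_direct(a, b, xi, eta, zeta):
    """Same curvature, using the fact that Q describes a coaxial helix of equal pitch."""
    c = math.hypot(a, b)
    radius = math.hypot(a - eta, (zeta * b - xi * a) / c)
    return radius / (radius**2 + b**2)


if __name__ == "__main__":
    a, b = 2.0, 1.0                # illustrative helix constants
    c_sq = a * a + b * b
    local = (0.3, -0.5, 0.8)       # xi, eta, zeta of the attached point
    print(curvature_of_c1(a / c_sq, b / c_sq, *local))
    print(curvature_direct(a, b, *local))
```

Agreement of the two values is a check on the components P, Q, R and on the sum ΣA₁² only; it does not test the torsion formula, whose sign depends on the convention adopted.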




3. Surfaces. First of all, some preliminary formulas in surface theory 
will be collected for subsequent use. Let us consider a real proper analytic 
surface S, not a sphere or a plane, whose parametric equations in a fixed co- 
ordinate system are 


(27)  $x = x(u, v), \quad y = y(u, v), \quad z = z(u, v).$

Let the lines of curvature be the parametric curves on the surface S, so that 
(28) F=0, D’=0, 

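in the classical notation of Eisenhart and Bianchi. The direction cosines $\alpha^u$,
$\beta^u$, $\gamma^u$ of the u-tangent at a point P(x, y, z) of S are given by the formulas

(29)  $\alpha^u = \frac{x_u}{E^{1/2}}, \quad \beta^u = \frac{y_u}{E^{1/2}}, \quad \gamma^u = \frac{z_u}{E^{1/2}}.$

Similarly, the direction cosines $\alpha^v$, $\beta^v$, $\gamma^v$ of the v-tangent of S at P are given
by


(30)  $\alpha^v = \frac{x_v}{G^{1/2}}, \quad \beta^v = \frac{y_v}{G^{1/2}}, \quad \gamma^v = \frac{z_v}{G^{1/2}},$


and the direction cosines a, b, c of the normal of S at P by


(31)  $a = \frac{y_uz_v - y_vz_u}{(EG)^{1/2}}, \quad b = \frac{z_ux_v - z_vx_u}{(EG)^{1/2}}, \quad c = \frac{x_uy_v - x_vy_u}{(EG)^{1/2}}.$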


The curvilinear parametric equations of any curve C through the point P 
on the surface S are 


(32)  $u = u(s), \quad v = v(s),$


the parameter s being the arc length measured from some fixed point of C. 
The direction cosines $\alpha$, $\beta$, $\gamma$ of the tangent of C at P are expressed by the
formulas


(33)  $\alpha = x_uu' + x_vv', \quad \beta = y_uu' + y_vv', \quad \gamma = z_uu' + z_vv' \qquad (u' = du/ds, \cdots).$


Let $\theta$ be the angle from the positive half of the u-tangent to the positive half
of the tangent of the curve C at the point P. Then one has


(34)  $\cos\theta = E^{1/2}u', \quad \sin\theta = G^{1/2}v'.$


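The principal normal curvatures $1/R_1$, $1/R_2$ of the surface S are given by
the formulas


(35)  $\frac{1}{R_1} = \frac{D}{E}, \quad \frac{1}{R_2} = \frac{D''}{G},$


and the geodesic curvatures $1/r_1$, $1/r_2$ of the lines of curvature by


(36)  $\frac{1}{r_1} = -\frac{E_v}{2EG^{1/2}}, \quad \frac{1}{r_2} = \frac{G_u}{2E^{1/2}G},$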

the subscript 1 in each case denoting the function associated with the u-curve, 
and 2 that with the v-curve, at a point P. 

Formulas analogous to the Frenet formulas can be established for the 
local trihedron whose edges are the tangents of the lines of curvature and 
the normal at a point P of a surface S. These formulas express the derivatives, 
with respect to the arc length s of a curve C, of the direction cosines of the 
three edges of the local trihedron linearly in terms of these cosines themselves, 
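the coefficients depending upon the functions $\theta$, $R_1$, $R_2$, $r_1$, $r_2$. In fact, actual
calculation, the details of which will be omitted, leads to the formulas in
question, namely,


(37)  $(\alpha^u)' = \left(\frac{\cos\theta}{r_1} + \frac{\sin\theta}{r_2}\right)\alpha^v + \frac{\cos\theta}{R_1}\,a,$
      $(\alpha^v)' = -\left(\frac{\cos\theta}{r_1} + \frac{\sin\theta}{r_2}\right)\alpha^u + \frac{\sin\theta}{R_2}\,a,$
      $a' = -\frac{\cos\theta}{R_1}\,\alpha^u - \frac{\sin\theta}{R_2}\,\alpha^v \qquad (a' = da/ds, \cdots),$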


and similar formulas for the remaining derivatives. With these should be 
associated the easily verified result 


(38)  $x' = \cos\theta\,\alpha^u + \sin\theta\,\alpha^v,$


with similar expressions for $y'$, $z'$.

Let us establish a local coordinate system at a point P of a surface S,
referred to its lines of curvature, with the origin at P, the $\xi$-axis along the
u-tangent, the $\eta$-axis along the v-tangent, and the $\zeta$-axis along the normal of
S at P. The equations of transformation between the coordinates X, Y, Z
of any point Q (supposed to be functions of u, v, and referred to the fixed co-
ordinate system) and the local coordinates $\xi$, $\eta$, $\zeta$ of Q are


(39)  $X = x + \alpha^u\xi + \alpha^v\eta + a\zeta,$
      $Y = y + \beta^u\xi + \beta^v\eta + b\zeta,$
      $Z = z + \gamma^u\xi + \gamma^v\eta + c\zeta.$

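Recursion formulas exactly analogous to those in §2 can be obtained by re-
peated differentiation of these equations. Differentiating once with respect
to the arc length s of the curve C we find, by means of (37), (38),

(40)  $X' = \alpha^uA_1 + \alpha^vB_1 + aC_1$


and similar formulas for $Y'$, $Z'$, in which the coefficients $A_1$, $B_1$, $C_1$ are de-
fined by


(41)  $A_1 = \cos\theta\left(1 - \frac{\eta}{r_1} - \frac{\zeta}{R_1}\right) - \sin\theta\,\frac{\eta}{r_2} + \xi',$
      $B_1 = \cos\theta\,\frac{\xi}{r_1} + \sin\theta\left(1 + \frac{\xi}{r_2} - \frac{\zeta}{R_2}\right) + \eta',$
      $C_1 = \cos\theta\,\frac{\xi}{R_1} + \sin\theta\,\frac{\eta}{R_2} + \zeta'.$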


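A second differentiation, followed by appropriate reduction, gives
(42)  $X'' = \alpha^uA_2 + \alpha^vB_2 + aC_2,$
where $A_2$, $B_2$, $C_2$ are defined by


(43)  $A_2 = -\cos\theta\left(\frac{B_1}{r_1} + \frac{C_1}{R_1}\right) - \sin\theta\,\frac{B_1}{r_2} + A_1',$
      $B_2 = \cos\theta\,\frac{A_1}{r_1} + \sin\theta\left(\frac{A_1}{r_2} - \frac{C_1}{R_2}\right) + B_1',$
      $C_2 = \cos\theta\,\frac{A_1}{R_1} + \sin\theta\,\frac{B_1}{R_2} + C_1'.$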


In general we find
(44)  $X^{(n)} = \alpha^uA_n + \alpha^vB_n + aC_n,$


where the local components $A_n$, $B_n$, $C_n$ of the derivative vector $X^{(n)}$, $Y^{(n)}$,
$Z^{(n)}$ are given by the recursion formulas


(45)  $A_n = -\cos\theta\left(\frac{B_{n-1}}{r_1} + \frac{C_{n-1}}{R_1}\right) - \sin\theta\,\frac{B_{n-1}}{r_2} + A_{n-1}',$
      $B_n = \cos\theta\,\frac{A_{n-1}}{r_1} + \sin\theta\left(\frac{A_{n-1}}{r_2} - \frac{C_{n-1}}{R_2}\right) + B_{n-1}',$
      $C_n = \cos\theta\,\frac{A_{n-1}}{R_1} + \sin\theta\,\frac{B_{n-1}}{R_2} + C_{n-1}'.$

With the definitions of the functions $A_n$, $B_n$, $C_n$ employed in this section,
the formulas (12), $\cdots$, (26) of §2 can easily be shown to be equally valid
for the local trihedron of surface theory. One thus obtains a theory differing
from that of §2 only in two particulars; namely, the curve C is now supposed
to lie on a given surface; and a different local trihedron is now being asso-
ciated with the curve C. These considerations will not be pursued further
here.

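The principal interest in the theory of the moving trihedron in surface
theory arises when the point P, instead of tracing a curve C on the surface S,
is allowed to vary over a suitably restricted region of S. In this case the local
components of the partial derivative vectors are required. These may be ob-
tained by specializing equations (37), (38), and (40), $\cdots$, (45), if it is kept
in mind that


(46)  $ds^u = E^{1/2}du, \quad ds^v = G^{1/2}dv,$


where $s^u$, $s^v$ denote arc lengths on the parametric curves. The required formu-
las can also be calculated directly. Either way one finds
(47)  $X_u = A^u\alpha^u + B^u\alpha^v + C^ua, \quad X_v = A^v\alpha^u + B^v\alpha^v + C^va,$
and similar formulas for the first partial derivatives of Y, Z, where the coeffi-
cients $A^u$, $\cdots$, $A^v$, $\cdots$ are defined by the formulas

(48)  $\frac{A^u}{E^{1/2}} = 1 - \frac{\eta}{r_1} - \frac{\zeta}{R_1} + \frac{\xi_u}{E^{1/2}}, \qquad \frac{A^v}{G^{1/2}} = -\frac{\eta}{r_2} + \frac{\xi_v}{G^{1/2}},$
      $\frac{B^u}{E^{1/2}} = \frac{\xi}{r_1} + \frac{\eta_u}{E^{1/2}}, \qquad \frac{B^v}{G^{1/2}} = 1 + \frac{\xi}{r_2} - \frac{\zeta}{R_2} + \frac{\eta_v}{G^{1/2}},$
      $\frac{C^u}{E^{1/2}} = \frac{\xi}{R_1} + \frac{\zeta_u}{E^{1/2}}, \qquad \frac{C^v}{G^{1/2}} = \frac{\eta}{R_2} + \frac{\zeta_v}{G^{1/2}}.$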


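Further differentiation, followed by appropriate reductions, yields

(49)  $X_{uu} = A^{uu}\alpha^u + B^{uu}\alpha^v + C^{uu}a,$
      $X_{uv} = A^{uv}\alpha^u + B^{uv}\alpha^v + C^{uv}a,$
      $X_{vv} = A^{vv}\alpha^u + B^{vv}\alpha^v + C^{vv}a,$


and similar formulas for the second partial derivatives of Y, Z, where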
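(50)  $\frac{A^{uu}}{E^{1/2}} = -\frac{B^u}{r_1} - \frac{C^u}{R_1} + \frac{A^u_u}{E^{1/2}}, \quad \frac{B^{uu}}{E^{1/2}} = \frac{A^u}{r_1} + \frac{B^u_u}{E^{1/2}}, \quad \frac{C^{uu}}{E^{1/2}} = \frac{A^u}{R_1} + \frac{C^u_u}{E^{1/2}},$

      $\frac{A^{uv}}{G^{1/2}} = -\frac{B^u}{r_2} + \frac{A^u_v}{G^{1/2}}, \quad \frac{B^{uv}}{G^{1/2}} = \frac{A^u}{r_2} - \frac{C^u}{R_2} + \frac{B^u_v}{G^{1/2}}, \quad \frac{C^{uv}}{G^{1/2}} = \frac{B^u}{R_2} + \frac{C^u_v}{G^{1/2}},$

      $\frac{A^{vv}}{G^{1/2}} = -\frac{B^v}{r_2} + \frac{A^v_v}{G^{1/2}}, \quad \frac{B^{vv}}{G^{1/2}} = \frac{A^v}{r_2} - \frac{C^v}{R_2} + \frac{B^v_v}{G^{1/2}}, \quad \frac{C^{vv}}{G^{1/2}} = \frac{B^v}{R_2} + \frac{C^v_v}{G^{1/2}}.$
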
The calculation of the local components of derivative vectors of higher order 
than the second is now purely mechanical, but none of them will be used 
hereinafter, and recursion formulas for the local components of the derivative 
vectors of the mth order need not be written. 

Let us suppose for the present that the locus of the point Q, when u, v
vary, is a proper surface $S_1$, and let the six fundamental coefficients and other
functions for this surface be indicated by subscripts 1. For the first three funda-
mental coefficients we find, by easy calculations from equations (47),

(51)  $E_1 = \sum (A^u)^2, \quad F_1 = \sum A^uA^v, \quad G_1 = \sum (A^v)^2,$
whence
(52)  $H_1^2 = E_1G_1 - F_1^2 = \sum (B^uC^v - B^vC^u)^2.$


The direction cosines of the u-tangent at a point of the surface $S_1$, referred to
the moving trihedron of the surface S at the corresponding point P, are
found to be


(53)  $\frac{A^u}{E_1^{1/2}}, \quad \frac{B^u}{E_1^{1/2}}, \quad \frac{C^u}{E_1^{1/2}},$


and similarly the direction cosines of the v-tangent are


(54)  $\frac{A^v}{G_1^{1/2}}, \quad \frac{B^v}{G_1^{1/2}}, \quad \frac{C^v}{G_1^{1/2}},$


while the direction cosines of the normal of $S_1$ are


(55)  $\frac{1}{H_1}(B^uC^v - B^vC^u), \quad \frac{1}{H_1}(C^uA^v - C^vA^u), \quad \frac{1}{H_1}(A^uB^v - A^vB^u).$


Finally, the second fundamental coefficients for the surface $S_1$ are found
to be given by


(56)  $D_1 = \frac{1}{H_1}\sum A^{uu}(B^uC^v - B^vC^u),$

      $D_1' = \frac{1}{H_1}\sum A^{uv}(B^uC^v - B^vC^u),$

      $D_1'' = \frac{1}{H_1}\sum A^{vv}(B^uC^v - B^vC^u).$

Since the six fundamental coefficients for the surface $S_1$ have been cal-
culated, it is only a formal matter to write the expressions for the mean and
total curvatures, the equation of the lines of curvature, etc., for the surface
$S_1$ in terms of the components $A^u$, $\cdots$.

4. Applications. Some applications of the theory of the moving trihedron
in the theory of surfaces, as explained in the preceding section, will now en-
gage our attention. First of all, equations (47) and the similar equations for
Y, Z show that the point Q(X, Y, Z) is fixed relative to the fixed coordinate
system if, and only if,


$A^u = B^u = C^u = A^v = B^v = C^v = 0.$


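Equations (48) now yield necessary and sufficient conditions that the point Q
be fixed relative to the fixed coordinate system, namely,


(57)  $\xi_u = -E^{1/2}\left(1 - \frac{\eta}{r_1} - \frac{\zeta}{R_1}\right), \quad \eta_u = -E^{1/2}\,\frac{\xi}{r_1}, \quad \zeta_u = -E^{1/2}\,\frac{\xi}{R_1},$
      $\xi_v = G^{1/2}\,\frac{\eta}{r_2}, \quad \eta_v = -G^{1/2}\left(1 + \frac{\xi}{r_2} - \frac{\zeta}{R_2}\right), \quad \zeta_v = -G^{1/2}\,\frac{\eta}{R_2}.$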


These conditions are very useful in solving envelope problems of a type which
will now be described. Let us consider a surface S referred to its lines of curva-
ture, and a two-parameter family of surfaces such that one of them, $S_1$, is
associated with each point P of S. Let the equation of $S_1$ be


$f(\xi, \eta, \zeta, u, v) = 0,$


in which $\xi$, $\eta$, $\zeta$ are local coordinates referred to the moving trihedron of S
at P, and u, v are the curvilinear coordinates of P. It may be required to find
the envelope of the surface $S_1$ when the point P describes a curve or a region
of the surface S. The usual method of investigating the envelope entails the
differentiation of the functions $\xi$, $\eta$, $\zeta$ with respect to u and v, and it will next
be shown that the conditions (57) are precisely the needed formulas for the
differentiation of local point coordinates. For this purpose, let us observe that
the result of solving equations (39) for $\xi$, $\eta$, $\zeta$ is


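(58)  $\xi = \alpha^u(X - x) + \beta^u(Y - y) + \gamma^u(Z - z),$
      $\eta = \alpha^v(X - x) + \beta^v(Y - y) + \gamma^v(Z - z),$
      $\zeta = a(X - x) + b(Y - y) + c(Z - z).$


Consequently the equation of the surface $S_1$ referred to the fixed coordinate
system can be written in the form


$f\left(\sum \alpha^u(X - x), \; \sum \alpha^v(X - x), \; \sum a(X - x), \; u, \; v\right) = 0,$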


the summation being for cyclical permutations. Since u, v occur explicitly
and also in $\alpha^u$, $\alpha^v$, a, x, $\cdots$, but not in X, Y, Z, partial differentiation yields


(59)  $f_\xi\xi_u + f_\eta\eta_u + f_\zeta\zeta_u + f_u = 0,$
      $f_\xi\xi_v + f_\eta\eta_v + f_\zeta\zeta_v + f_v = 0,$


where the partial derivatives of $\xi$, $\eta$, $\zeta$ are to be calculated from equations
(58) by direct differentiation with X, Y, Z fixed. If use is made of equations
(37), suitably specialized, to obtain the partial derivatives of $\alpha^u$, $\alpha^v$, a, $\cdots$
as linear combinations of $\alpha^u$, $\alpha^v$, a, $\cdots$, and if equations (58) themselves
are then employed to express the derivatives of $\xi$, $\eta$, $\zeta$ as functions of $\xi$, $\eta$, $\zeta$,
the result of the differentiation can be reduced to equations (57), as was to
be shown.

By way of illustration let us consider the osculating plane of the u-curve
at the point P of the surface S. If the equation of this plane, referred to the
fixed coordinate system, is written in the usual form, the equations of trans-
formation (39) and the equations (29) together with the equations obtained
by differentiating the latter with respect to u can be used to show that the
local equation of the osculating plane of the u-curve is


(60)  $\frac{\eta}{R_1} - \frac{\zeta}{r_1} = 0.$


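If this equation is differentiated with respect to v, the result can be reduced
by means of one of the conditions of Codazzi, namely,


(61)  $\left(\frac{1}{R_1}\right)_v = \frac{G^{1/2}}{r_1}\left(\frac{1}{R_1} - \frac{1}{R_2}\right),$


to


(62)  $\frac{\xi}{r_2} - \frac{\eta}{r_1} + \left[\frac{R_1}{G^{1/2}}\left(\frac{1}{r_1}\right)_v - \frac{1}{R_2}\right]\zeta + 1 = 0,$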


provided that the surface S is not developable. Equations (60), (62) taken
together are the equations of the characteristic of the osculating plane of the
u-curve when v varies. The equations of the orthogonal projection of this
characteristic onto the tangent plane are


(63)  $\zeta = 0, \quad \frac{\xi}{r_2} - \frac{\eta}{r_1} + r_1\left[\frac{1}{G^{1/2}}\left(\frac{1}{r_1}\right)_v - \frac{1}{R_1R_2}\right]\eta + 1 = 0.$


Since the equations of the ray of the lines of curvature, namely, the straight
line joining the Laplace transformed points or ray-points $(0, r_1, 0)$ and $(-r_2,
0, 0)$, are

(64)  $\zeta = 0, \quad \frac{\xi}{r_2} - \frac{\eta}{r_1} + 1 = 0,$

it follows that the orthogonal projection of the characteristic of the osculating
plane of the u-curve, when v varies, onto the tangent plane coincides with the ray
if, and only if,


(65)  $\frac{1}{G^{1/2}}\left(\frac{1}{r_1}\right)_v = \frac{1}{R_1R_2}.$

Differentiation of equation (62) would enable us to find the edge of regression 
of the developable enveloped by the osculating plane of the u-curve when v 
varies. 

The equation of the rectifying plane of the u-curve at the point P can easily
be shown to be

(66)  $\frac{\eta}{r_1} + \frac{\zeta}{R_1} = 0,$

since this plane must contain the tangent line, $\eta = \zeta = 0$, and must be per-
pendicular to the osculating plane (60) of the u-curve. The equations of the
characteristic of this plane when v varies can be found by the method just
used for the osculating plane. The equations of the orthogonal projection
of this line onto the tangent plane turn out to differ from equations (63) only
in that the sign of $\eta$ has been changed. Therefore the projections onto the tangent
plane of the characteristics of the osculating plane and rectifying plane of the
u-curve, when v varies, are symmetrically placed with respect to the tangent line
of the u-curve.

The machinery of the local trihedron can be efficiently used to investigate 
the focal surfaces of the congruence of normals of a surface S, and the Laplace 
transformed nets of the lines of curvature on S, but as the principal results 
are well known, this study need not be entered upon here. It may be worthy 
of comment, however, that it is easy to locate the centers of the osculating 
circle and osculating sphere of the u-curve. Differentiating the equation $\xi=0$
of the normal plane of the u-curve with respect to u we obtain the equations
of the polar line of the u-curve at the point P, namely, 


(67) 


R, 


This line intersects the osculating plane (60) in the center of the osculating 
circle of the u-curve, whose coordinates are thus found to be 


the radius of curvature $\rho_1$ of the u-curve being given by


(68) 

pr 7 re R? 
The polar line meets the surface normal, $\xi=\eta=0$, at the center of the principal
normal curvature corresponding to the u-curve, $(0, 0, R_1)$, and meets the
v-tangent, $\xi=\zeta=0$, at the ray-point of the u-curve, $(0, r_1, 0)$. A second differentiation
with respect to u and solution of three simultaneous equations yield
the coordinates of the center of the osculating sphere of the u-curve, namely,


TiPlu TiPlu 


the torsion $1/\tau_1$ of the u-curve being given by


1 1 1/1 1/1 
pr Ritn/u \Ri/u 
The usual formula for the radius of the osculating sphere, in terms of $\rho_1$,
$\tau_1$, and their derivatives with respect to the arc length of the u-curve, could
easily be used to write down a condition necessary and sufficient that one


family of lines of curvature, namely, the u-curves, on a surface be spherical. 
Similar results can be obtained with the u-curves and v-curves interchanged. 


2 2 
ak 
Ri 




Necessary and sufficient conditions that the surface S, generated by the point 
Q may be obtainable from the surface S by a translation can be found in the 
following way. In case these surfaces differ only by a translation, the differ- 
ences X—x, Y—y, Z—z are constants. Differentiating equations (39) under 
this assumption we find the required conditions, namely, 


A* = A’ = 0, 
(70) «= 0, Be = 

= 0, c°=0. 
These conditions are equivalent to the conditions (57) with the modification 


that the terms consisting of the number $-1$ must be deleted from the parentheses
in the formulas for $\xi_u$, $\eta_v$ therein.


UNIVERSITY OF CHICAGO, 
CHICAGO, ILL.



PROPERTIES OF FUNCTIONS f(x,y) OF BOUNDED 
VARIATION* 


BY 
C. RAYMOND ADAMS AND JAMES A. CLARKSON 


1. INTRODUCTION 


In a recent paper we investigated the relations between several definitions
of bounded variation for functions f(x, y) of two real variables.† These
definitions are usually associated with the names of Vitali, Hardy, Arzelà,
Pierpont, Fréchet, and Tonelli respectively; we proved the equivalence of the
definition formulated by Pierpont and the modified form of it given by Hahn.§

Since the several definitions were assembled in CA, it is hardly necessary
to repeat them here. But we shall again denote the classes of functions satisfying
the respective definitions by V, H, A, P, F, and T. In addition the class
of functions continuous in (x, y) will be designated by C, the class of functions
belonging to the Baire classification by|| B, the class of functions having
measurable total variation functions¶ $\phi(\bar{x})$ and $\psi(\bar{y})$ by $M_{\phi,\psi}$, and the class
of functions having superficial measure by M; and the common part of two
or more classes will be indicated by the product of the corresponding letters.
The domain of definition of f(x, y) is generally to be understood as a rectangle
with sides parallel to the axes** ($a \le x \le b$, $c \le y \le d$); the letter R, with or without
a subscript, will always stand for such a rectangle.

Functions g(x) of bounded variation are of great interest and usefulness 
because of their valuable properties, particularly with respect to additivity, 
decomposability into monotone functions, continuity, differentiability, meas- 

* Presented to the Society, December 27, 1933; received by the editors April 30, 1934. 

† Clarkson and Adams, On definitions of bounded variation for functions of two variables, these
Transactions, vol. 35 (1933), pp. 824-854. Hereafter this paper will be referred to as CA. 

‡ Since the paper CA was written, our attention has been called to two additional definitions;
of these the first is due to Wiener, Laplacians and continuous linear functionals, Acta Szeged, vol. 3
(1927), pp. 7-16. The second is that of Nalli and Andreoli, Sull'area di una superficie, sugli integrali
multipli di Stieltjes e sugli integrali multipli delle funzioni di più variabili complesse, Accademia dei
Lincei, Rendiconti, (6), vol. 5 (1927), pp. 963-966. The fact that class T·C contains as a proper subclass
all continuous functions satisfying the definition of Nalli and Andreoli or a modified form of it
has been shown by Tonelli, Sulla definizione di funzione di due variabili a variazione limitata, ibid., (6),
vol. 7 (1928), pp. 357-363. In this sequel to CA these additional definitions will not be further considered.

§ This will be spoken of as the $P_H$-form of Pierpont's definition.

|| This must not be confused with B in CA, which stood for the class of bounded functions. 


¶ $\phi(\bar{x})$ [$\psi(\bar{y})$] represents the total variation of $f(\bar{x}, y)$ [$f(x, \bar{y})$] in y [x]; see CA.
** For brevity we shall sometimes indicate such a closed rectangle by the notation (a, c; b, d).



712 C. R. ADAMS AND J. A. CLARKSON {October 


urability, integrability, etc.; and it is largely to the possession of these prop- 
erties that such functions owe their important role in the study of rectifiable 
curves, Fourier and other series, Stieltjes and other integrals, and the cal- 
culus of variations. Proposers of definitions of bounded variation for functions 
f(x, y) have been actuated mainly by the desire to single out for attention a 
class of functions having properties analogous to some particular properties 
of a function g(x) of bounded variation. It has long since become apparent 
that to preserve properties of one sort the definition of bounded variation for
g(x) should be extended to f(x, y) in one way, while to preserve properties of 
another sort a quite different extension may be needed. 

It is natural that in CA the only detailed study of properties of functions 
f(x, y) belonging to the several classes V, H, A, P, F, and T should have had 
to do with the nature of the total variation functions $\phi(\bar{x})$ and $\psi(\bar{y})$, since
properties of this kind seemed to bear most directly upon the problem of de- 
termining relations between the classes. Properties of functions belonging to 
the classes V and F (and by implication H) with respect to double Stieltjes 
integrals of the Riemann type have recently been examined by Clarkson.* 
It would seem worth while to make a systematic study of the properties of 
additivity, decomposability, etc., enjoyed by functions belonging to each of 
the six classes, and it is to this object that the present paper is mainly devoted.
The determination of such properties has by no means been utterly
neglected by previous writers; indeed we shall state a few results that are al- 
ready well known, and certain of our theorems will constitute extensions of 
such results. 

It will appear that the aggregate of functions in class T lacks certain desirable
properties because of the necessity for $\phi(\bar{x})$ and $\psi(\bar{y})$ to be measurable.
And the evidence seems to indicate that the definition of Tonelli, precisely
as formulated by him, may attain its greatest usefulness when applied to
functions which to a certain extent are well behaved, perhaps to the extent
of belonging to the Baire classification. In order that a function f(x, y) may
not fail to be included in the class merely because its $\phi$ or $\psi$ is non-measurable,
we define the extended class T to consist of those functions f for which $\phi$ and $\psi$
are respectively dominated by summable functions; this class we designate by $\tilde{T}$.
Such extension of Tonelli's class has proved desirable in recent work by
Gergen† and by Morrey.‡

* Clarkson, On double Riemann-Stieltjes integrals, Bulletin of the American Mathematical 
Society, vol. 39 (1933), pp. 929-936. 

† Gergen, Convergence criteria for double Fourier series, these Transactions, vol. 35 (1933), pp.
29-63.

‡ Morrey, A class of representations of manifolds. I, American Journal of Mathematics, vol. 55
(1933), pp. 683-707.




1934] FUNCTIONS f(x, y) OF BOUNDED VARIATION 713 


Throughout this paper the difference operators $\Delta_{10}$, $\Delta_{01}$, and $\Delta_{11}$, when
applied to $f(x_i, y_j)$, will have the following meaning:

$$\Delta_{10} f(x_i, y_j) = f(x_{i+1}, y_j) - f(x_i, y_j),$$
$$\Delta_{01} f(x_i, y_j) = f(x_i, y_{j+1}) - f(x_i, y_j),$$
$$\Delta_{11} f(x_i, y_j) = \Delta_{10}\bigl(\Delta_{01} f(x_i, y_j)\bigr).$$

When applied to f(x, y), the operators will have a similar significance, it being
understood that the increments of x and y involved are greater than zero
but otherwise arbitrary.
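As an illustration only, the three difference operators can be evaluated on a finite net. The sketch below is not part of the original paper: the helper names `d10`, `d01`, `d11`, and `net_sum_abs_d11` are hypothetical, and the net sum of the absolute values of the second mixed differences is used on the assumption that it is the quantity bounded in the Vitali definition assembled in CA.

```python
def d10(f, xs, ys, i, j):
    """Increment of f in x, starting from the net point (xs[i], ys[j])."""
    return f(xs[i + 1], ys[j]) - f(xs[i], ys[j])


def d01(f, xs, ys, i, j):
    """Increment of f in y, starting from the net point (xs[i], ys[j])."""
    return f(xs[i], ys[j + 1]) - f(xs[i], ys[j])


def d11(f, xs, ys, i, j):
    """Second mixed difference: d10 applied to d01 over the cell (i, j)."""
    return (f(xs[i + 1], ys[j + 1]) - f(xs[i + 1], ys[j])
            - f(xs[i], ys[j + 1]) + f(xs[i], ys[j]))


def net_sum_abs_d11(f, xs, ys):
    """Sum of |d11| over all cells of the net (assumed Vitali-type sum)."""
    return sum(abs(d11(f, xs, ys, i, j))
               for i in range(len(xs) - 1)
               for j in range(len(ys) - 1))


# A 4-by-4 net on the unit square; dyadic points keep the arithmetic exact.
xs = [k / 4 for k in range(5)]
ys = [k / 4 for k in range(5)]

product = lambda x, y: x * y            # every cell contributes (1/4) * (1/4)
additive = lambda x, y: x ** 2 + 3 * y  # g(x) + h(y): mixed differences vanish

print(net_sum_abs_d11(product, xs, ys))
print(net_sum_abs_d11(additive, xs, ys))
```

A function of the form g(x) + h(y) always has vanishing second mixed differences, which is the observation behind the examples given later for class V.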


2. A PROPERTY OF CLASS P 


THEOREM 1. If f(x, y) is in class P, $\phi(\bar{x})$ [$\psi(\bar{y})$] is dominated by a summable
function.


For each $n \ge 1$ let $N_n$ designate the net of $n^2$ cells used in the $P_H$-form
of the definition, and denote by $\phi_n^{*}(\bar{x})$ the sum of the oscillations of f in
the cells of that column in whose base $\bar{x}$ lies. For definiteness we may associate
$\bar{x}$, when it is the coordinate of a line of $N_n$ other than x = a, with the subinterval
of (a, b) whose right-hand end point is $\bar{x}$. Then $\phi_n^{*}(\bar{x})$ is a step-function
and, if B denotes a bound for the $P_H$-sum, we have for each n

(1) $\int_a^b \phi_n^{*}(x)\,dx = \frac{b-a}{n} \sum_{\nu=1}^{n^2} \omega_\nu \le B(b - a).$

For each $\bar{x}$ let

$$\phi^{*}(\bar{x}) = \liminf_{n \to \infty} \phi_n^{*}(\bar{x});$$

in the light of (1) it is known† that $\phi^{*}(\bar{x})$ is summable in (a, b). Next let

$$\phi_n^{**}(\bar{x}) = \sum_{i=1}^{n} \bigl[\text{oscillation of } f(\bar{x}, y) \text{ in the interval } y_{i-1} \le y \le y_i\bigr]$$

for each n, and set

$$\phi^{**}(\bar{x}) = \liminf_{n \to \infty} \phi_n^{**}(\bar{x}).$$

For every $\bar{x}$ in (a, b) for which $\phi(\bar{x})$ is finite we have‡

(2) $\phi^{**}(\bar{x}) \ge \phi(\bar{x})/2,$

† See Schlesinger and Plessner, Lebesguesche Integrale und Fouriersche Reihen, Berlin, 1926, p. 91.
‡ See Hobson, Theory of Functions of a Real Variable, 3d edition, vol. 1, Cambridge, 1927, p. 331.
It is easily proved that the total variation and total fluctuation of any function g(x) are equal when 
both are finite, and that if either is infinite the other is likewise. 




and it is easily seen that when $\phi(\bar{x})$ is infinite, $\phi^{**}(\bar{x})$ is likewise. Moreover,
for each $\bar{x}$ we have

$$\phi_n^{*}(\bar{x}) \ge \phi_n^{**}(\bar{x})$$

for all n, whence

(3) $\phi^{*}(\bar{x}) \ge \phi^{**}(\bar{x})$

except when $\phi^{**}(\bar{x})$ is infinite, in which case $\phi^{*}(\bar{x})$ is also infinite. The theorem
for $\phi(\bar{x})$ now follows from inequalities (2) and (3); a similar proof may
be given for $\psi(\bar{y})$.

COROLLARY 1. If f(x, y) is in class P and $\phi(\bar{x})$ and $\psi(\bar{y})$ are measurable,†
f(x, y) is also in class T.

The common part of the overlapping classes P and T may now be specified
by the relation $P\cdot T = P\cdot M_{\phi,\psi}$.


COROLLARY 2. If f(x, y) is in class P, $\phi(\bar{x})$ [$\psi(\bar{y})$] is finite almost everywhere.‡


That $\phi$ may be infinite at an everywhere dense set§ and that $\partial f/\partial x$ and
$\partial f/\partial y$ may fail to exist (finite or infinite) at a set everywhere dense in the
rectangle R, when f is in P, is illustrated by the following example. Let the
rational points in the interval $0 \le x \le 1$ be enumerated as $x_1, x_2, \dots$; for
$x = x_n$ (n = 1, 2, ...) and y rational ($0 \le y \le 1$) let $f(x, y) = 1/2^n$; elsewhere
in the unit square I (0, 0; 1, 1) let f = 0.

From Theorem 1 we have $\tilde{T} \ge P$; the relation $\tilde{T} > P$ then follows from
example (D) of CA. The fundamental relations of inclusiveness between the
several classes are therefore

(4) $\tilde{T} > P > A > H$, $\quad F > V > H$, $\quad \tilde{T} > T > H$;

and when only functions belonging to the Baire classification are admitted to
consideration||,


† Montgomery, Properties of plane sets and functions of two variables, to appear in the American
Journal of Mathematics, Theorem 17, has shown that $f \subset B$ implies measurability of $\phi$ and $\psi$.

‡ Although Theorem 2 of CA was sufficient for the purposes of that paper, this corollary improves
the result.

§ This first fact was illustrated by the example following the proof of Theorem 2 in CA, but the 
example given here is somewhat more easily shown to be in P. 

|| These relations are an immediate consequence of the results of CA in conjunction with Mont- 
gomery’s Theorem 17, loc. cit. From the standpoint of continuity we may remark that the inclusive- 
ness relations are like (5) when only functions possessing one of the following properties are admitted 
to consideration: continuity in (x, y) [see CA], continuity in x and in y, semi-continuity in (x, y),
upper semi-continuity in one variable and lower semi-continuity in the other. A function having 
this last property belongs to Baire’s class 1 at most; see Kempisty, Sur les fonctions semicontinues par 


(5) $T\cdot B > P\cdot B > A\cdot B > H\cdot B$, $\quad F\cdot B > V\cdot B > H\cdot B$.

These are the basis for numerous statements in the following pages.


3. CLOSURE OF THE SEVERAL CLASSES UNDER ARITHMETIC OPERATIONS 


THEOREM 2. Each of the classes V, H, A, P, F, and $\tilde{T}$ is closed under addition
(and subtraction).* This is not true† of T.


The first part of this theorem is an immediate consequence of the definitions.
For the second part we may break up example (C) of CA into monotone
components as follows: E being a linearly non-measurable set of points
on the downward-sloping diagonal d of the unit square I (0, 0; 1, 1), set

$$f_1(x, y) = \begin{cases} 0 & \text{below } d, \\ 1 & \text{above } d, \\ 1 & \text{on } E, \\ 0 & \text{elsewhere on } d, \end{cases} \qquad f_2(x, y) = \begin{cases} 0 & \text{below and on } d, \\ 1 & \text{above } d. \end{cases}$$

Each of these functions is clearly in T, although $f_1 - f_2$, which is example (C)
of CA, is not.


THEOREM 3. Each of the classes H, A, and P is closed under multiplication.‡
This is not true of V, F, T, or $\tilde{T}$.


For H and A the theorem may readily be proved by aid of decomposition 
theorems given in §4. Since P contains only bounded functions, the proof for 
P flows at once from 


LEMMA 1. Let $f_1$ and $f_2$ be functions of any number of variables, defined for
an arbitrary range of variation S of those variables. If $f_1$ and $f_2$ are bounded,
and the least upper bound of $|f_i|$ is denoted by $B_i$ (i = 1, 2), the following inequality
connects the oscillations of $f_1$, $f_2$, and $f_1 \cdot f_2$ over S:

$$\operatorname{Osc}(f_1 \cdot f_2) \le B_2 \operatorname{Osc} f_1 + B_1 \operatorname{Osc} f_2.$$
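Since the lemma is stated for an arbitrary range S, it applies in particular to finite sets of points, where it can be spot-checked numerically. The following sketch is not from the paper; the names `osc` and `lemma1_holds` are hypothetical, the sample values are random, and a small tolerance is an assumption made only to absorb floating-point rounding.

```python
import random


def osc(values):
    """Oscillation of a function over a finite set: max minus min."""
    return max(values) - min(values)


def lemma1_holds(f1_vals, f2_vals, tol=1e-12):
    """Check Osc(f1*f2) <= B2*Osc(f1) + B1*Osc(f2) on a finite set S.

    f1_vals[k] and f2_vals[k] are the values of f1 and f2 at the k-th
    point of S; B1 and B2 are the maxima of |f1| and |f2| over S.
    """
    b1 = max(abs(v) for v in f1_vals)
    b2 = max(abs(v) for v in f2_vals)
    prod = [a * b for a, b in zip(f1_vals, f2_vals)]
    return osc(prod) <= b2 * osc(f1_vals) + b1 * osc(f2_vals) + tol


random.seed(0)  # fixed seed so the run is reproducible
trials = [
    ([random.uniform(-2.0, 3.0) for _ in range(50)],
     [random.uniform(-1.0, 4.0) for _ in range(50)])
    for _ in range(1000)
]
all_ok = all(lemma1_holds(a, b) for a, b in trials)
print(all_ok)
```

The check rests on the identity f1(p)f2(p) - f1(q)f2(q) = f1(p)[f2(p) - f2(q)] + f2(q)[f1(p) - f1(q)], which gives the inequality directly; the case analysis mentioned in the text is another route to the same bound.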


rapport à chacune de deux variables, Fundamenta Mathematicae, vol. 14 (1929), pp. 237-241. On the
other hand, when only functions upper [lower] semi-continuous in each separate variable are ad- 
mitted, the relations are like (4); see example (C) of CA. 

* For H this fact was observed by Hardy, On double Fourier series, and especially those which 
represent the double zeta-function with real and incommensurable parameters, Quarterly Journal of 
Mathematics, vol. 37 (1905), pp. 53-79. 

† It is quite clear that if $f_1(x, y)$ and $f_2(x, y)$ are both in T, $f = f_1 + f_2$ will fail to be in T when and
only when at least one of its total variation functions is non-measurable. By Theorem 17 of Montgomery,
loc. cit., this cannot happen if only functions belonging to the Baire classification are admitted
to consideration.

‡ For H this fact was observed by Hardy, loc. cit.




Designating by $a_i$ and $b_i$ respectively the greatest lower and least upper
bounds of $f_i$ (i = 1, 2), one may easily construct a proof of the lemma by considering
seriatim all possible cases for the relationship of the intervals $(a_i, b_i)$
to the origin.

That the product of two functions in V·C may not even be in F is seen
at once from the following example:

(6) $f_1(x, y) = \begin{cases} x \sin (1/x) & \text{for } x > 0, \\ 0 & \text{for } x = 0, \end{cases} \qquad f_2(x, y) = y$ in the unit square I.

The theorem fails for T because the product of two functions in T may
have a non-measurable $\phi$ or $\psi$; viz.,

(7) $f_1(x, y) = \begin{cases} 1 & \text{for } x \text{ in } E,\ y = 0, \\ 1 & \text{for } x \text{ in } C(E),\ y = 1, \\ 0 & \text{elsewhere}, \end{cases} \qquad f_2(x, y) = \begin{cases} 1 & \text{for } y = 0, \\ 0 & \text{for } y > 0, \end{cases}$

in I, E being a non-measurable set in the interval (0, 1) and C(E) its complement.
The theorem also fails for T, and likewise for $\tilde{T}$, because of the well
known theorem of Lebesgue: if $g_1(x)$ is a summable function not essentially
bounded, there always exists a summable function $g_2(x)$ such that $g_1 \cdot g_2$ is not
summable over the interval considered. We may consider the interval in question
as (0, 1) and set in I

$$f_i(x, y) = \begin{cases} g_i(x) & \text{for } y = 0, \\ 0 & \text{for } y > 0 \end{cases} \qquad (i = 1, 2).$$


Remarks. That the relations $f_1 \subset V\cdot C$, $f_2 \subset H\cdot C$ do not imply $f_1 \cdot f_2 \subset F$ is
shown by (6). That $f_1 \subset T$, $f_2 \subset H$ do not imply $f_1 \cdot f_2 \subset T$ is apparent from (7);
nevertheless one may readily show by aid of Lemma 1 and the theorem of
Montgomery referred to above that if both $f_1$ and $f_2$ are in T, are bounded,
and belong to the Baire classification, $f_1 \cdot f_2$ must be in T; similarly, if $f_1$ and $f_2$
are in $\tilde{T}$ and are bounded, $f_1 \cdot f_2$ is in $\tilde{T}$. That $f_1$ may be in A·C [P·C or T·C]
and $f_2$ in H·C without $f_1 \cdot f_2$ being in H [A or P respectively] is clear from the
fact that $f(x, y) \equiv 1$ is in H·C.




THEOREM 4. Each of the classes H, A, and P is closed under division, the 
denominator being assumed bounded away from zero.* This is not true of V, F, 
or T. 


In the light of Theorem 3 it suffices, for the first statement, to consider 
the case of 1/f for f in the class in question and $|f| \ge m > 0$. The fact has been
stated for H by Hardy, loc. cit.; a proof can be constructed by aid of a little 
double series technique. For A one may give a proof precisely like that of the 
corresponding theorem for a function g(x) of bounded variation. 

Proof for P. Let B be a bound for the $P_H$-sum for f, and for each n let
$a_n$ be the number of cells, in the net of $n^2$ cells, in which f changes sign; then

(8) $B \ge \sum_{\nu=1}^{n^2} \omega_\nu(f)/n \ge 2m a_n/n.$

Let us set

$$\sum_{\nu=1}^{n^2} \omega_\nu(1/f) = {\sum}' + {\sum}'',$$

$\sum'$ representing the sum over the cells in which f changes sign and $\sum''$ the
sum over the remaining cells. In each cell of the first set we have

(9) $\omega_\nu(1/f) \le 2/m;$

denoting by $M_\nu$ and $m_\nu$ respectively the least upper and greatest lower bound
of |f| in the $\nu$th cell, we have for each cell of the second set

(10) $\omega_\nu(1/f) = 1/m_\nu - 1/M_\nu \le (M_\nu - m_\nu)/m^2 = \omega_\nu(f)/m^2.$

From (8), (9), and (10) follows the inequality

$$\sum_{\nu=1}^{n^2} \omega_\nu(1/f)/n \le 2a_n/(mn) + B/m^2 \le 2B/m^2$$

for every n, and the proof is complete.

That f may be in V and $|f| \ge m > 0$ without 1/f even being in F is seen
from the following example. Let I be divided into subrectangles by the lines
x = 1 - 1/n (n = 2, 3, ...). Proceeding from left to right, in the first, third,
... rectangles let f = 1 except along the (closed) top and right-hand side;
on the entire top and right-hand side, except at their common point where 
f=3, let f=2. At points of the even-numbered (closed) rectangles not already 
considered define f as 2 except along the top, where f=3. For x=1 and all y 
let f=1. 

If f is in T and $|f| \ge m > 0$ one readily sees that 1/f can fail to be in T


* The necessity for imposing this restriction is clear, since H, A, and P contain no unbounded 
functions. 




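only if its $\phi$ or $\psi$ is non-measurable. This situation occurs in the case of

$$f(x, y) = \begin{cases} 3/2 & \text{for } x \text{ in } E,\ y = 0, \\ 1/2 & \text{for } x \text{ in } C(E),\ y = 1, \\ 1 & \text{otherwise} \end{cases} \quad \text{in } I,$$

E being a non-measurable set.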

Remarks. Since V, F, and T contain unbounded functions, the restriction
that the denominator be bounded away from zero in connection with these
classes is perhaps more than would normally be expected. That $\tilde{T}$ is not
closed under division is apparent from the following example: $f_1(x, y) = 1/x^{1/2}$
and $f_2(x, y) = x^{1/2}$ for y = 0, x > 0; and both functions equal to 1 elsewhere in I.
If consideration is restricted to bounded functions in $\tilde{T}$, it is readily seen that
this subclass of $\tilde{T}$ is closed under division, the denominator being assumed
bounded away from zero (see Remarks following Theorem 3).
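The summability behind this example can be verified by a direct computation. The following is a sketch under stated assumptions, not part of the original text: for 0 < x ≤ 1 each function equals 1 for y > 0, so its total variation in y is taken to be the single jump at y = 0, and the subscripts 1, 2, and q (for the quotient of the first function by the second, which equals 1/x on y = 0) are labels introduced here.

```latex
\begin{align*}
\int_0^1 \phi_1(x)\,dx &= \int_0^1 \bigl(x^{-1/2} - 1\bigr)\,dx = 2 - 1 = 1, \\
\int_0^1 \phi_2(x)\,dx &= \int_0^1 \bigl(1 - x^{1/2}\bigr)\,dx = 1 - \tfrac{2}{3} = \tfrac{1}{3}, \\
\int_\epsilon^1 \phi_q(x)\,dx &= \int_\epsilon^1 \bigl(x^{-1} - 1\bigr)\,dx
  = -\log \epsilon - (1 - \epsilon) \longrightarrow \infty \quad (\epsilon \to 0^{+}).
\end{align*}
```

The first two total variation functions are thus summable, while that of the quotient is measurable but not summable and so cannot be dominated by a summable function.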


4. RELATIONSHIPS WITH MONOTONE FUNCTIONS; DECOMPOSITION 


THEOREM 5. A necessary and sufficient condition that* f(x, y) be in class V
is that it be expressible as the difference between two functions, $f_1(x, y)$ and
$f_2(x, y)$, satisfying the inequalities

$$\Delta_{11} f_i(x, y) \ge 0 \qquad (i = 1, 2).$$

The necessity has essentially been shown by Hobson†; the sufficiency is
quite clear from Theorem 2.


THEOREM 6 (Hardy‡). A necessary and sufficient condition that f(x, y) be
in class H is that it be expressible as the difference between two bounded functions,
$f_1(x, y)$ and $f_2(x, y)$, satisfying the inequalities§

$$\Delta_{10} f_i(x, y) \ge 0, \quad \Delta_{01} f_i(x, y) \ge 0, \quad \Delta_{11} f_i(x, y) \ge 0 \qquad (i = 1, 2).$$


THEOREM 7 (Arzelà||). A necessary and sufficient condition that f(x, y) be
in class A is that it be expressible as the difference between two bounded functions,
$f_1(x, y)$ and $f_2(x, y)$, satisfying the inequalities


* In order that the V-definition may always have meaning it is to be understood here that f, 
although perhaps unbounded, is everywhere finite. The functions f;, fz are of like character. 

† Hobson, loc. cit., p. 345.

‡ Hardy, loc. cit.

§ Functions satisfying these inequalities have been called “monotonely monotone” by W. H. 
and G. C. Young, On the discontinuities of monotone functions of several variables, Proceedings of the 
London Mathematical Society, (2), vol. 22 (1923), pp. 124-142. They belong to the class of “quasi- 
monotone” functions as defined by Hobson, loc. cit., p. 347. 

|| Arzelà, Sulle funzioni di due variabili a variazione limitata, Bologna Rendiconto, (2), vol. 9
(1904-05), pp. 100-107. 




$$\Delta_{10} f_i(x, y) \ge 0, \quad \Delta_{01} f_i(x, y) \ge 0 \qquad (i = 1, 2).$$


Remarks. Although every bounded function monotone in the sense of
Hobson is in class A·T, not all such are in H. Every function quasi-monotone
in the sense of Hobson is in class V and if bounded is also in H.
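The three inequalities of Theorem 6 can be tested on a finite net, which gives a necessary (not sufficient) check that a proposed pair of components is admissible. The sketch below is illustrative only: the function name `hardy_conditions_hold`, the sample function (x - 1/2)(y - 1/2), and its splitting into two components are all hypothetical choices, not taken from the paper.

```python
def hardy_conditions_hold(f, xs, ys, tol=1e-12):
    """True if all first differences in x and in y and all second mixed
    differences of f over the net (xs, ys) are non-negative."""
    nx, ny = len(xs), len(ys)
    # First differences in x along every horizontal net line.
    for j in range(ny):
        for i in range(nx - 1):
            if f(xs[i + 1], ys[j]) - f(xs[i], ys[j]) < -tol:
                return False
    # First differences in y along every vertical net line.
    for i in range(nx):
        for j in range(ny - 1):
            if f(xs[i], ys[j + 1]) - f(xs[i], ys[j]) < -tol:
                return False
    # Second mixed differences over every cell.
    for i in range(nx - 1):
        for j in range(ny - 1):
            d11 = (f(xs[i + 1], ys[j + 1]) - f(xs[i + 1], ys[j])
                   - f(xs[i], ys[j + 1]) + f(xs[i], ys[j]))
            if d11 < -tol:
                return False
    return True


# Net on the unit square with dyadic points (exact in floating point).
xs = [k / 8 for k in range(9)]
ys = [k / 8 for k in range(9)]

f = lambda x, y: (x - 0.5) * (y - 0.5)  # not monotone in x for y < 1/2
f1 = lambda x, y: x * y + 0.25          # candidate first component
f2 = lambda x, y: (x + y) / 2           # candidate second component

print(hardy_conditions_hold(f, xs, ys))
print(hardy_conditions_hold(f1, xs, ys), hardy_conditions_hold(f2, xs, ys))
```

Here f = f1 - f2 identically, so the sample function is the difference of two bounded components that pass the net test, although f itself does not.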


THEOREM 8. Every bounded* function non-decreasing in each of two directions
is in class P.


Let $\alpha$ and $\beta$ respectively ($\alpha < \beta$) be the angles made with the positive
x-axis by the given directions in which f is non-decreasing. For $\alpha = 0$, $\beta = \pi/2$ a
proof has been given by Hahn†, in establishing the relation $P \ge A$. We now
prove the theorem for $0 < \alpha < \beta < \pi/2$; it will be clear that the method is
applicable in all cases.

Using the $P_H$-form of the definition, let a net of $n^2$ cells be placed upon R,
and let the columns of cells be numbered from left to right and the rows from
bottom to top. Indices i, j may then be employed to designate the cell in the
ith row and jth column. With this cell (for each pair of values i, j) we associate
two points $p_{ij}$ and $q_{ij}$ defined as follows (see accompanying figure):

[Figure: a cell of the net with its associated points $p_{ij}$ and $q_{ij}$, and the closed sector marked I.]

$p_{ij}$ [$q_{ij}$] is the point from which the cell is seen under the angle $\beta - \alpha$, the
sides of the angle having the directions $\pi + \alpha$, $\pi + \beta$ [$\alpha$, $\beta$]. Let $q_{i+k,\,j+l}$ be the
point (or a point) of the set $q_{ij}$ lying in the closed sector marked I in the
figure and at a minimum distance from $p_{ij}$. The integers k, l are now fixed and
are clearly independent of n.


* If the directions of assumed monotonicity are axial (i.e., the function is monotone in the sense 
of Hobson), finiteness of the function everywhere implies boundedness; otherwise this may not be so. 
† Hahn, Theorie der Reellen Funktionen, Berlin, 1921, p. 546.




It suffices to consider $n \ge \max(10k, 10l)$. For such a value of n the sum
of the oscillations of f in the first and last k rows and the first and last l
columns of cells is $\le 2B \cdot 2kn + 2B \cdot 2ln = O(n)$, B being a bound for |f| in R.
For each remaining cell the associated points $p_{ij}$, $q_{ij}$ lie within R. These remaining
cells constitute a block of $(n - 2k)(n - 2l)$ cells, and it will simplify
matters a little to regard the row indices of these cells as running from 1 to
$n - 2k$ and the column indices from 1 to $n - 2l$. From the above choice of k
and l we clearly have


S (pis) S 540) 


Hence, for this remaining block of cells, the sum of the oscillations of f is 


— 3k;j7 =1,2,---,” —3l). 


n—2k,n—21 


t,j=l 


t=n—3k+1 j=l 


IIA 


IIA 


| Kae 


j=1 t=k+1 j=1 


S [(m — 2k)l + k(n — 31) + h(n — 21) + (n — 3k)IJB = O(n). 


This completes the proof for the case considered. 

It may be noted that a function non-decreasing in two directions must be
non-decreasing in any third direction lying in the angle ($< \pi$) formed by the
first two. Therefore, in constructing a proof for other cases, one may always
reduce a case in which $\beta - \alpha$ is $> \pi/2$ to a case in which $\beta - \alpha$ is $< \pi/2$, of
which that considered above is typical.

Remarks. It would be of considerable interest to determine whether a
function in class P can always be decomposed into the difference between two
functions each of which is bounded and monotone in two directions. If this
were true it would follow at once* that every function in P has a total differential
almost everywhere, settling a question left open in §6.

Lebesgue has defined a function f to be monotone if it satisfies the following
condition: p being any point of the region considered and C any closed
curve in this region containing p in its interior, we have g.l.b. of f on C $\le f(p)$
$\le$ l.u.b. of f on C. It is easily seen by examples that not all functions satisfying

*See Haslam-Jones, Derivate planes and tangent planes of a measurable function, Quarterly 
Journal of Mathematics, Oxford Series, vol. 3 (1932), pp. 120-132; or Saks, Théorie de l’Intégrale, 
Warsaw, 1933, p. 238. 




this condition belong to any one of the classes V, H, A, P, F, T, and $\tilde{T}$.

We conclude this Section with two quite obvious theorems concerning de- 
composition of a different sort, the first expressing a fact which has already 
been frequently observed. 


THEOREM 9. A necessary and sufficient condition that f(x, y) be in class V
is that $f(x, y) = \bar{f}(x, y) + g(x) + h(y)$ where $\bar{f}(x, y)$ is in H.

DEFINITION. The subclass of F of which each function has $\phi(\bar{x})$ and $\psi(\bar{y})$
finite somewhere (and therefore finite everywhere†) will be designated by F*.


It should be observed that F* = F·T, the relationship of which to other
classes was considered in CA.


THEOREM 10. A necessary and sufficient condition that f(x, y) be in class F
is that $f(x, y) = \bar{f}(x, y) + g(x) + h(y)$ where $\bar{f}(x, y)$ is in F*.


5. ADJUNCTION OR SUBDIVISION OF RECTANGLES 
We state without proof two theorems. 


THEOREM 11. If a function is in any one of the several classes for each of
two rectangles $R_1$ and $R_2$ whose sum is a rectangle R, it is in the same class for R.


THEOREM 12. If a function is in any one‡ of the classes V, H, A, P, F, T·B,
or $\tilde{T}$ for a rectangle R, it is in the same class for any subrectangle $R_1$. This is
not true of T.


6. CONTINUITY, DIFFERENTIABILITY, MEASURABILITY, AND INTEGRABILITY OF 
FUNCTIONS BELONGING TO THE SEVERAL CLASSES 


THEOREM 13. If f(x, y) is in class V and $f(x, \bar{y})$ [$f(\bar{x}, y)$] for some $\bar{y}$ [$\bar{x}$] has
only a denumerable number of discontinuities in x [y], the discontinuities in
x [y] of f(x, y) are located on a denumerable number of parallels to the y-axis
[x-axis].


Let E be the set of points at which f has a discontinuity in x and assume
the existence of a non-denumerable set S of vertical lines each containing at
least one point of E. Clearly only a denumerable subset of S can be made up
wholly of points of E. Let the remaining lines of S constitute the subset $S_1$;
then each line of $S_1$ contains at least one point of E and at least one point not


† See CA, Theorem 3.

‡ For H this fact was observed by W. H. Young, On multiple Fourier series, Proceedings of the
London Mathematical Society, (2), vol. 11 (1912), pp. 133-184, especially p. 143. The failure of T
to enjoy the property in question is illustrated by $f_1(x, y)$ in (7).




in E, and $S_1$ is non-denumerable. On each line of $S_1$ choose a point of E; at
this point f has a positive saltus in x. This non-denumerable set of saltuses
contains a subset whose elements are the terms of a divergent series. A net
can therefore be placed upon R to yield an arbitrarily large V-sum; from this
contradiction flows the theorem.


THEOREM 14. If f(x, y) is in class V, the discontinuities in (x, y) which are 
not discontinuities in x or in y are denumerable. 


Let the oscillation at any such discontinuity $(x_1, y_1)$ be a; then it is clear
that in every neighborhood of this point there exists a second point $(x_2, y_2)$
such that $\Delta_{11} f$ for the cell $(x_1, y_1; x_2, y_2)$ is $> a/4$. The assumption that the set
of such discontinuities is non-denumerable then leads to a contradiction just
as in the case of Theorem 13.


COROLLARY. If f(x, y) is in class H, the discontinuities of f(x, y) are located
on a denumerable number of parallels to the axes.†


THEOREM 15. Class V (and therefore F) contains bounded‡ functions which
are everywhere discontinuous both in x and in y; it also contains bounded non-measurable
functions. Class V·C (and therefore F·C) contains functions of
which neither first partial derivative exists (finite or infinite) anywhere.§


Examples. The function f(x, y)=g(x)+h(y), where both g and h are 
bounded and everywhere discontinuous, has the first property specified; if g 
is bounded and linearly non-measurable and h is identically zero, f has the 
second property; if g and h are continuous but have a derivative (finite or infinite)
nowhere, f has the third property.

Of course it follows that V contains functions for which the double 
Lebesgue integral over R fails to exist, and that V-C contains functions which 
are nowhere totally differentiable. Nevertheless, that every function in V for 
which f(x, c) and f(a, y) have (finite) approximate derivatives almost everywhere 
possesses an approximate total differential almost everywhere is a consequence 
of Theorems 9 and 16, in conjunction with a theorem of Stepanoff.|| 


† This corollary is also a consequence of Theorems 2 and 6 and results obtained by W. H. and
G. C. Young, loc. cit.

‡ Unbounded functions having the same property are included also.

§ It would be of considerable interest to determine whether the same is true of F* (see §4), 
which from one point of view is the essential part of F and which bears to F a relationship similar to 
that of H to V, or whether functions in F* possess properties of continuity, etc., more like those 
possessed by functions in H. 

|| Stepanoff, Sur les conditions de l'existence de la différentielle totale, Recueil de la Société Mathématique
de Moscou, vol. 32 (1925), pp. 511-526; or see Saks, loc. cit., p. 228. According to Stepanoff's
theorem a necessary and sufficient condition that f ($\subset M$) have an approximate total differential almost
everywhere in R is that f have (finite) approximate first partial derivatives almost everywhere
in R.




THEOREM 16 (Burkill and Haslam-Jones*). A function f(x, y) in class A 
is totally differentiable almost everywhere. 


THEOREM 17. A function f(x, y) in class P is continuous in (x, y) almost 
everywhere. 


Assume the set E of points at which f has a saltus $\ge \epsilon > 0$ to have exterior
measure k > 0. Let the area of R be denoted by S and let $[kn^2/S]$ stand for
the largest integer not exceeding $kn^2/S$. For a net of $n^2$ cells under the $P_H$-form
of the definition, we see that at least $[kn^2/S]$ cells of the net must contain
points of E; hence we have

$$\sum_{\nu=1}^{n^2} \omega_\nu / n \ge [kn^2/S]\,\epsilon/n,$$

which is unbounded unless k is zero. Therefore, if f is in P, k must vanish for
every $\epsilon > 0$, and by a classical argument it follows that the discontinuities of f
are a set of plane measure zero.

Of course it may be inferred that the double Riemann integral over R of a
function in P always exists†; another consequence is the relation $\tilde{T}\cdot M > P$.


THEOREM 18. If f(x, y) is in class T·M, ∂f/∂x [∂f/∂y] exists (finite) 
almost everywhere.‡ 


Since f is in M, the set E at which ∂f/∂x fails to exist (finite) is 
measurable.§ Since f is in T, E is intersected by almost every line y = ȳ in a set of 
linear measure zero. Hence, by Fubini's theorem, E is of plane measure zero. 


COROLLARY 1. A function f(x, y) in class T·M has an approximate total 
differential almost everywhere.|| 


This follows from the theorem of Stepanoff cited above. 


COROLLARY 2. If f(x, y) is in class T·M, each first partial derivative is 
L-integrable‡ over R. 


It is worthy of notice that the hypothesis f ⊂ M cannot be dispensed with in 
Theorem 18 and its corollaries. This may easily be shown by example as fol- 


* Burkill and Haslam-Jones, Notes on the differentiability of functions of two variables, Journal of 
the London Mathematical Society, vol. 7 (1932), pp. 297-305; see also Haslam-Jones, loc. cit. 

† See Hobson, loc. cit., p. 477. 

‡ Theorem 18 and Corollary 2 are extensions of results obtained by Morrey (loc. cit., Theorem 1, 
§1) on the assumption f ⊂ T·C. After Theorem 18 is established, his proof suffices for Corollary 2. 

§ See Burkill and Haslam-Jones, loc. cit., Lemma 2. 

|| Corollary 1 constitutes an extension of a similar result obtained at the expense of considerable 
trouble by Burkill and Haslam-Jones, loc. cit.: they assumed f to be in T·M and to satisfy a further 
measurability condition; i.e., the condition which in §7 we shall show is satisfied by all functions in H. 



724 C. R. ADAMS AND J. A. CLARKSON [October 


lows. The existence of a bounded set which is not plane measurable and of 
which at most two points lie on any straight line has been proved by 
Sierpiński*; let E be such a set entirely contained in the rectangle (a, c; b, d), 
where 0 < a < b < 1, 0 < c < d < 1. Then choose any four numbers a₁, b₁, c₁, d₁ to 
satisfy the inequalities b < a₁ < b₁ < 1, d < c₁ < d₁ < 1 and form the set E₁ by 
adding to E the following points: for each x₁ [y₁] in the interval (0, 1), if the line 
x = x₁ [y = y₁] contains only one point of E, add the point (x₁, c₁) [(a₁, y₁)]; 
if this line contains no point of E, add both the points (x₁, c₁) [(a₁, y₁)] and 
(x₁, d₁) [(b₁, y₁)]. The characteristic function of E₁ is in T (as well as T̄) but 
not in M, and it fails to have any of the properties specified in Theorem 18 
and its corollaries. 

Remarks. Example (D) of CA shows that T·M contains bounded functions 
(satisfying in addition the measurability condition considered in §7) 
which are everywhere discontinuous in (x, y) and hence nowhere totally 
differentiable. It has been proved by Saks† that there exist functions nowhere 
totally differentiable which are not only in T·C but satisfy considerably more 
stringent conditions. For f ⊂ H, W. H. Young‡ has shown that the two cross 
partial derivatives of second order also exist almost everywhere. 


7. A PROPERTY OF CLASS H 


Let us set V_x(x₀, y₀) = the total variation of f(x, y₀) in x for a ≤ x ≤ x₀, 
V_y(x₀, y₀) = the total variation of f(x₀, y) in y for c ≤ y ≤ y₀; then we may 
formulate the 


DEFINITION. A function f(x, y) will be said to have the property M, when 
and only when V_x(x, y) and V_y(x, y) are both measurable functions of (x, y) in R. 


THEOREM 19. A function f(x, y) in class H has the property M,. 


We give a proof for V_y(x, y). Let us assume that this function is 
non-measurable, and in particular that a is a number such that the set E[V_y(x, y) 
> a] is non-measurable. Clearly E consists of the points on a set of inverted 
ordinates Ω standing on (or hanging from) the top of the rectangle R; an 
ordinate may consist of a single point, or, if it contains more than one point, it 
may or may not have a lowest point (i.e., be closed). 


* Sierpiński, Sur un problème concernant les ensembles mesurables superficiellement, Fundamenta 
Mathematicae, vol. 1 (1920), pp. 112-115. 

† Saks, On the surfaces without tangent planes, Annals of Mathematics, (2), vol. 34 (1933), pp. 
114-124. 

‡ W. H. Young, Sur la dérivation des fonctions à variation bornée, Comptes Rendus (Paris), 
vol. 164 (1917), pp. 622-625. 




By Theorem 13 the discontinuities in y of f(x, y) lie on a denumerable 
number of lines y = ȳ; let Z_y designate this set of values ȳ. The feet of the 
ordinates Ω form a measurable set E₁, since E₁ is identical with the set of 
points x for which the measurable* function φ(x) is > a. Let the lengths of 
the ordinates Ω define a function g(x) over E₁. Since g(x) is ≤ d − c, E can fail 
to be measurable only if the L-integral of g(x) over E₁ fails to exist†, and this 
can occur only if g(x) is a non-measurable function. Let β (≥ 0) be a number 
for which E₁[g(x) > β] is non-measurable. There is no restriction in 
assuming‡, as we now do, that d − β does not belong to Z_y. All ordinates Ω of length 
β will then be open. 

Let E₂ be the projection of this set on the line y = d − β, and let C(E₂) represent 
its complement with respect to the interval a ≤ x ≤ b on this line. At each 
point of E₂ which is a limit point of C(E₂), V_y(x, d − β) is manifestly 
discontinuous in x; these points constitute a set E₃. Since C(E₂) is non-measurable, 
E₃ must be likewise. Hence m_e E₃, and therefore the exterior measure of the 
set of points at which V_y(x, d − β) has a discontinuity in x, is positive. On the 
other hand, by Theorem 12 above and Theorem 1 of CA, the discontinuities 
of V_y(x, d − β) are denumerable. From this contradiction we infer the 
theorem. 

Remarks. Example (C) of CA is a function in A which does not have the 
property M,; hence Theorem 19 fails for A, P, and T. Examples of a function 
(either measurable or non-measurable) in T but without the property M, 
may readily be constructed. We think it probable that Theorem 19 fails for 
V and F, but an example to show this does not immediately suggest itself. 
It should be observed that M, is not an additive property, as is illustrated 
by the example following Theorem 2. Nevertheless, f being in V, V_y for f is 
identical with V_y for f̄, where f̄(x, y) = f(x, y) − f(x, ȳ) and ȳ is any fixed value 
in the interval (c, d); and f̄(x, y) = f̿(x, y) + f̄(x̄, y), x̄ being any fixed value in 
(a, b), can have no discontinuity in y where f̿(x, y) (⊂ H) and f̄(x̄, y) are both 
continuous in y. Therefore the above proof of Theorem 19 can be used to 
establish the following assertion: if f(x, y) is in V and there exists an x̄ [ȳ] in 
(a, b) [(c, d)] for which f(x̄, y) [f(x, ȳ)] is continuous almost everywhere, 
V_y(x, y) [V_x(x, y)] is measurable. 


* See CA, Theorem 1. 
† See Carathéodory, Vorlesungen über Reelle Funktionen, Berlin, 1918, p. 419; and Schlesinger 
and Plessner, loc. cit., p. 78. 

‡ See Saks, Théorie de l'Intégrale, loc. cit., p. 37, where it is shown that measurability of the set 
E[g(x) > a] for every rational a is sufficient to insure measurability of g(x). It is clear that the proof 
remains valid if we assume E[g(x) > a] measurable for any set of values a which is everywhere dense. 




8. THE EFFECT OF LIPSCHITZ CONDITIONS 
It is clear that the satisfaction of a Lipschitz condition, 

|f(x + Δx, y + Δy) − f(x, y)| ≤ k(Δx² + Δy²)^{1/2}    (k = constant), 


is sufficient to place f in class A·C (and therefore P·C and T·C). At the same 
time it is insufficient to put f in H, V, or F, as the following example shows. 
Divide the unit rectangle I into columns by the lines x = 1 − 1/2ⁿ (n = 1, 2, ···); 
proceeding from left to right, divide the nth column into 2ⁿ squares. On 
each square define f by the height of a regular pyramid with that square as 
base and with altitude equal to a side of the square, and let f(1, y) = 0. 
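
As an illustration only (it is not part of the original argument), the pyramid example can be explored numerically. The sketch below assumes the unit square, with the nth column lying between x = 1 − 1/2ⁿ⁻¹ and x = 1 − 1/2ⁿ, and evaluates the sum of the absolute second differences of f over the uniform net of mesh 2^(−N); the names pyramid_f and v_sum are ours. Under these assumptions each column that the net resolves contributes the same fixed amount, so the sum grows without bound as N increases, although every face of every pyramid has slope 2.

```python
def pyramid_f(x, y):
    """Height of the pyramid surface at (x, y) in the unit square; f(1, y) = 0."""
    if x >= 1.0:
        return 0.0
    n = 1
    while x >= 1.0 - 2.0 ** (-n):    # locate the column 1 - 2^-(n-1) <= x < 1 - 2^-n
        n += 1
    s = 2.0 ** (-n)                   # side of each square in the nth column
    cx = 1.0 - 2.0 * s + s / 2        # x-coordinate of the apexes in this column
    j = min(int(y / s), 2 ** n - 1)   # index of the square containing y
    cy = (j + 0.5) * s                # y-coordinate of that square's apex
    # Regular pyramid of altitude s on a square of side s: faces have slope 2.
    return s - 2.0 * max(abs(x - cx), abs(y - cy))


def v_sum(N):
    """Sum of |second differences| of f over the uniform net of mesh 2^-N."""
    m = 2 ** N
    h = 1.0 / m
    vals = [[pyramid_f(i * h, j * h) for j in range(m + 1)] for i in range(m + 1)]
    total = 0.0
    for i in range(m):
        for j in range(m):
            d11 = vals[i + 1][j + 1] - vals[i + 1][j] - vals[i][j + 1] + vals[i][j]
            total += abs(d11)
    return total


if __name__ == "__main__":
    for N in range(1, 7):
        print(N, v_sum(N))
```

Printing v_sum(N) for successive N makes the steady growth visible; the computation uses only dyadic rationals, so the floating-point arithmetic is exact here.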

Fréchet* has observed that the satisfaction of a Lipschitz condition in 
terms of area, 


|Δ₁₁ f(x, y)| ≤ k|Δx·Δy|    (k = constant), 


is sufficient to insure that f be in V (and therefore F); that it does not suffice 
to put f in any of the other classes may readily be seen by examples. 


9. DEPENDENCE UPON AXES 


It is quite clear that a function in class V, H, A, or F may fail to remain 
in that class when the x, y axes are rotated through a suitably chosen angle. 
On the other hand, definition P may easily be proved independent of the 
axes, and T·C is manifestly independent of the axes because of its geometric 
significance.† The question for T (or T̄) is not so easily answered, and we 
shall construct an example to show that T (and T̄) is not independent of the 
axes.‡ 

Let E_x [E_y] be the set of numbers in the interval (0, 1) which have a 
triadic representation free from the digit 2 [1], and define f(x, y) as the 
characteristic function of the set E of points (x, y) for which x is in E_x and y is 
in E_y. Since E_x and E_y are Cantor sets of measure zero, we have f ⊂ T. It will 
be shown that f does not remain in T when the axes are rotated through the 
angle π/4. 

The equation of the perpendicular to y = x at (x₀, x₀) is x + y = 2x₀, and 
it is apparent that for any x₀ in the interval (0, ½) this line contains at least 


* Fréchet, Extension au cas des intégrales multiples d’une définition de Vintégrale due a Stieltjes, 
Nouvelles Annales de Mathématiques, (4), vol. 10 (1910), pp. 241-256. 

† That is, a necessary and sufficient condition that a continuous surface z = f(x, y) have area in 
the Lebesgue sense is f(x, y) ⊂ T. 

‡ We are indebted to Dr. W. C. Randels for suggesting this example. It is probable that our 
purpose would also be served by the characteristic function of some of the sets constructed by 
Mazurkiewicz and Saks, Sur les projections d’un ensemble fermé, Fundamenta Mathematicae, vol. 8 
(1926), pp. 109-113, but the example given here seems somewhat easier to discuss. 




one point of E. For, 2x₀ being given in triadic form, corresponding digits in 
x₁ and y₁ such that x₁ + y₁ = 2x₀ can be chosen as follows: in x₁ [y₁] put a 0 
wherever a 0 or 2 [1] occurs in 2x₀ and put a 1 [2] wherever a 1 [2] occurs in 
2x₀. Such a choice of digits for x₁, y₁ may be said to be "according to rule." 
Let φ₁(x) be the φ-function for the new x-axis (i.e., the line y = x in the 
original coordinate system). 
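
The digit rule can be checked mechanically. The following sketch is ours, not the authors'; it works with finite triadic expansions and exact fractions, and the function names are hypothetical. It splits the digits of 2x₀ into digits for x₁ (using only 0 and 1) and for y₁ (using only 0 and 2) and confirms that the two numbers add up to 2x₀ without any carrying.

```python
from fractions import Fraction


def split_by_rule(digits):
    """Split the triadic digits of 2*x0 into digits for x1 and y1 'according to rule'."""
    x_digits = [1 if d == 1 else 0 for d in digits]  # x1 is free from the digit 2
    y_digits = [2 if d == 2 else 0 for d in digits]  # y1 is free from the digit 1
    return x_digits, y_digits


def triadic_value(digits):
    """Exact value of the finite triadic fraction .d1 d2 d3 ..."""
    return sum(Fraction(d, 3 ** (k + 1)) for k, d in enumerate(digits))


if __name__ == "__main__":
    two_x0 = [1, 0, 2, 2, 1, 0]          # an arbitrary illustrative expansion
    x1, y1 = split_by_rule(two_x0)
    print(x1, y1, triadic_value(x1) + triadic_value(y1) == triadic_value(two_x0))
```

Because each digit of 2x₀ is sent wholly to x₁ or wholly to y₁, the digitwise sums never exceed 2, which is why no carry can occur.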

Consider first any 2x₀ of the form .10···, the remaining digits being 
arbitrary. The points (x₀, x₀) corresponding to these numbers fill an interval 
I₁ on y = x of length 1/(3²·2^{1/2}). To each such number we have 


(11)    x₁ = .10···, y₁ = .00···;    x₁ = .01···, y₁ = .02···, 


the remaining digits in all cases being chosen according to rule. We then have 


∫_{I₁} φ₁ > 2/(3²·2^{1/2}). 


Next consider 2x₀ = .ab10···, the subsequent digits being arbitrary and 
a, b anything except 1, 0. The points (x₀, x₀) corresponding to these numbers 
fill 3² − 1 intervals each of length 1/(3⁴·2^{1/2}); this set of intervals we may call 
I₂. To each number 2x₀ of the present form we may choose the third and 
fourth digits as the first two were chosen in (11) and choose the rest according 
to rule. We obtain 


∫_{I₂} φ₁ > 2(3² − 1)/(3⁴·2^{1/2}). 


Continuing in this manner we find 


∫ φ₁ > Σ_{n=1}^{∞} 2(3² − 1)^{n−1}/(3^{2n}·2^{1/2}) = 2/2^{1/2}. 


Repeating this process using blocks of 2p digits 1010···10, to each of 
which there correspond 2^p choices instead of the two in (11), we obtain 


∫ φ₁ > Σ_{n=1}^{∞} 2^p(3^{2p} − 1)^{n−1}/(3^{2pn}·2^{1/2}) = 2^p/2^{1/2}. 


Since p is arbitrary, ∫φ₁ does not exist and our assertion is proved. 


10. FACTORABLE FUNCTIONS BELONGING TO THE SEVERAL CLASSES 


For our present purposes a function f(x, y) will be called factorable if 
and only if we have in R 


f(x, y) = g(x)h(y), 




with neither g nor h identically zero.* The verification of the following 
equations is then immediate: 


Δ₁₁ f(x, y) = Δg(x)·Δh(y), 


and for each net 


Σ_{i,j} |Δ₁₁ f(x_i, y_j)| = max |Σ_{i,j} ε_i ε̄_j Δ₁₁ f(x_i, y_j)| = Σ_i |Δg(x_i)| · Σ_j |Δh(y_j)|. 
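
For a factorable function the second difference over a cell is the product of the first differences of the two factors, so the sum of absolute second differences over a net is the product of the two one-dimensional variation sums. A minimal numerical check of this, written by us with hypothetical names and with integer-valued factors so that the comparison is exact, might read:

```python
def v_sum_factorable(g, h, xs, ys):
    """Sum of |second differences| of f(x, y) = g(x) * h(y) over the net xs x ys."""
    def f(x, y):
        return g(x) * h(y)

    total = 0
    for x0, x1 in zip(xs, xs[1:]):
        for y0, y1 in zip(ys, ys[1:]):
            total += abs(f(x1, y1) - f(x1, y0) - f(x0, y1) + f(x0, y0))
    return total


def variation_sum(u, ts):
    """Sum of |first differences| of u over the subdivision ts."""
    return sum(abs(u(t1) - u(t0)) for t0, t1 in zip(ts, ts[1:]))


if __name__ == "__main__":
    g = lambda x: x * x - 3 * x          # illustrative factors only
    h = lambda y: y ** 3 - 2 * y
    xs, ys = [0, 1, 2, 4, 5], [-1, 0, 2, 3]
    print(v_sum_factorable(g, h, xs, ys),
          variation_sum(g, xs) * variation_sum(h, ys))
```

The two printed numbers agree for any net, which is the content of the factorization of the second difference.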


Conclusions may be drawn as follows. 


THEOREM 20. A necessary and sufficient condition that a factorable function 
be in class H is that each factor be of bounded variation. A factorable function, 
with one factor of unbounded variation and the other a constant, is in V and F 
but not in A, P, T, or T̄. A factorable function, with one factor of unbounded 
variation and the other not a constant, is not in V, F, or A; it is not in P, T, 
or T̄ unless the latter factor vanishes almost everywhere, and even then it may 
not be. 


COROLLARY. Class A contains no factorable functions save those in H; F 
contains no factorable functions save those in V; but each of the classes V, P, and T 
contains factorable functions which are not in H. A factorable function in T 
but not in H must vanish almost everywhere in R. 


11. THE “VARIATION” OF FUNCTIONS BELONGING TO THE SEVERAL CLASSES 


It is our object here to direct attention to two things: (i) the fact that a 
function belongs to one of the several classes conveys, in most cases, compara- 
tively little idea of the extent to which the function fluctuates in R; and 
(ii) the difficulty of associating with a function belonging to any one of the 
several classes, by means of the definition of that class, a number which con- 
veys any precise notion of the amount of fluctuation of the functional values. 

Let us first consider the classes V, F, and T (or T̄). It has already been 
remarked in CA that the V-sum and the maximum F-sum for a given net N 
are never decreased when new horizontal or vertical lines are added to form 
a net N′. Therefore it might be considered natural to define the total 
variation of a function in V or F as the least upper bound of the respective V- 
or F-sum. For a function in T the quantities 


* In the contrary case f is obviously in H. 




(12)    (1/(b − a)) ∫_a^b φ(x̄) dx̄,    (1/(d − c)) ∫_c^d ψ(ȳ) dȳ 


are the average total variations respectively of f(x̄, y) in y and f(x, ȳ) in x. 
One might therefore consider it desirable to define the total variation of a 
function in T as the larger of the numbers (12), or perhaps some linear com- 
bination of them. Under such definitions each of the classes V, F, and T con- 
tains* functions with an arbitrarily large saltus at every point in R whose total 
variation is zero! The reader inclined to be critical of our point of view may 
aver that it should not matter much what values a function f(x, y) has on a 
set of plane measure zero, and it is true that a function in T whose total 
variation is zero according to the definition suggested above is “almost a con- 
stant”; nevertheless we are inclined to insist that when the total variation of 
g(x) is in question, it matters a great deal what values g(x) assumes on a set 
of linear measure zero. 

If when f is in H one were to define the total variation as the least upper 
bound of the V-sum, very little notion of the amount of fluctuation would be 
conveyed; for every function f(x, y)=g(x), where g(x) is of bounded varia- 
tion, would have total variation zero as a function of two variables, indepen- 
dently of the value of the total variation of g(x) and although in general 
f(x, y) is not even “almost a constant”. 

If a function is in A, it would be natural to define its total variation as the 
least upper bound of the A-sum. This procedure, however, has several dis- 
advantages, including the fact that the total variation of a function in R 
would not in general be the sum of its total variations in the two rectangles 
into which R is divided by a vertical (or horizontal) line. 

It is quite clear that except when f(x, y) is a constant, the total variation 
of a function in P, defined as the least upper bound of the P-sum, would de- 
pend upon the value of the fixed upper bound for D, the side of the square 
cells employed. One would naturally turn, therefore, to the P_H-form of the 
definition. Since the P_H-sum may decrease as n increases, it might be 
preferable to define the total variation, not as the least upper bound of the P_H-sum, 
but as the lim_{n→∞} of this sum. Whichever choice were made, the definition 
would be open to the objection that the total variation of such a function as 


f(x, y) = 1 for x ≥ x̄, 
        = 0 for x < x̄ 


would be different for x̄ rational and for x̄ irrational. This objection can only 


* This is clearly indicated by examples given above in §6 and example (D) of CA. 




be met by insisting that the oscillations ω_ν in the n² cells be computed for 
cells so defined that no two have points in common; yet if this were done, the 
total variation in R would not in general be equal to the sum of the total 
variations in the two rectangles into which R is divided by a vertical (or 
horizontal) line. 

For the reasons described above it would seem desirable to regard the several 
definitions of bounded variation for functions of two variables purely as formal 
generalizations of analytic conditions in common use in the theory of functions 
of a single variable or as conditions which single out for consideration some class 
of functions having one or more properties like those of a function g(x) of bounded 
variation,* and rather completely to disassociate the term “function of bounded 
variation” from any notion of the amount which the function f(x, y) fluctuates 
in the rectangle R. 


* See certain remarks in CA, pp. 826-827. 


BROWN UNIVERSITY, 
PROVIDENCE, R. I. 


A NEW METHOD FOR WARING THEOREMS 
WITH POLYNOMIAL SUMMANDS* 


BY 
L. E. DICKSON 


1. Part I of this paper is self-contained and presupposes only the 
rudiments of elementary theory of numbers. The method employs a pair of 
polynomials p(x) and q(x) of degree m, each uniquely determined by the other, 
such that there exists an identity which expresses Jq(s) as a sum of m values 
of p(x²), where J is an integer and s is a sum of four squares. Then a Waring 
theorem for q(x) yields one for p(x²). For, if every (large) integer is a sum of v 
values of q(x), then every (large) multiple of J is a sum of vm values of p(x²). 
From the last result we readily find how many values of p(x²) suffice for all 
integers. 

Apart from the special case in which q(x) is a power of x, there is no hint 
in the literature of this instantaneous deduction of a Waring theorem for an 
even polynomial of degree 2m from a known Waring theorem for a 
polynomial of degree m. On the contrary, Maillet resorted to an extensive proof 
for the case m = 2. 

We feel justified in perfecting the theory of sums of four values of a 
quadratic function q(x) in view of the resulting theorems for certain polynomials of 
degrees 4, 8, etc. 

Since we seek Waring theorems holding for all positive integers (or with 
all exceptions listed), we are not content with theorems holding for all suffi- 
ciently large integers and certainly not with the asymptotic results much in 
vogue. 


Part I. WARING THEOREMS FOR POLYNOMIALS OF DEGREES 2 AND 4 
2. Using the abbreviation s = a² + b² + c² + d², we have the identities 

(1)    6s = Σ_{12} (a ± b)²,    6s² = Σ_{12} (a ± b)⁴, 


in which the summands are the powers of 

(2)    a ± b, a ± c, a ± d, b ± c, b ± d, c ± d. 

Write 

(3)    f(x) = ux⁴ + vx²,    q(x) = ux² + vx. 


* Presented to the Society, April 7, 1934; received by the editors June 29, 1934. 

732 L. E. DICKSON 


Hence we have the following identity* in a, b, c, d, u, v: 


(4)    6q(s) = Σ_{12} f(a ± b). 
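
The identity is polynomial in a, b, c, d, u, v and so can be spot-checked with exact integer arithmetic. The sketch below is ours; it takes f(x) = ux⁴ + vx² and q(x) = ux² + vx as in (3), and the helper names are hypothetical.

```python
from itertools import combinations, product


def twelve_sum(f, a, b, c, d):
    """Sum of f over the twelve numbers a±b, a±c, a±d, b±c, b±d, c±d."""
    return sum(f(p + q) + f(p - q) for p, q in combinations((a, b, c, d), 2))


def identity_holds(a, b, c, d, u, v):
    """Check 6*q(s) == sum of the twelve values of f, with s = a²+b²+c²+d²."""
    s = a * a + b * b + c * c + d * d
    f = lambda x: u * x ** 4 + v * x ** 2
    q = lambda x: u * x ** 2 + v * x
    return twelve_sum(f, a, b, c, d) == 6 * q(s)


if __name__ == "__main__":
    ok = all(identity_holds(a, b, c, d, 3, -1)
             for a, b, c, d in product(range(-3, 4), repeat=4))
    print(ok)
```

Taking u = 1, v = 0 and u = 0, v = 1 in the same check recovers the two identities (1) separately.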


First, take u = v = 1/2. Then q(x) = x(x + 1)/2 is called a triangular 
number. It is known† that every positive integer n is a sum of three triangular 
numbers q(s_i) with s_i ≥ 0. But each such integer s_i is a sum of four squares. 
Hence (4) used three times shows that 6n is a sum of 36 values of f(x) = 
(x⁴ + x²)/2 for integers x. Every positive integer p is of the form 6n + r, 
0 ≤ r ≤ 5, while r = r f(1), whence p is a sum of 41 values of f(x). 

We can reduce 41 to 38 as follows. The numbers f(0) = 0, f(1) = 1, f(2) = 10, 
f(3) = 45 are congruent modulo 6 to 0, 1, 4, 3. Also, 2f(1) = 2, f(1) + f(2) ≡ 5 
(mod 6). Hence if M is any integer, we can find two integers a, b, each ≥ 0, 
such that 


f(a) + f(b) ≡ M (mod 6),    f(a) + f(b) ≤ 45. 


Thus if M ≥ 45, M − f(a) − f(b) is a multiple ≥ 0 of 6 and hence is a sum of 36 
values of f(x), whence M is a sum of 38 such values. But 10x + y = x f(2) + y f(1) 
is a sum of x + y values of f(x), whence every integer < 100 is a sum of fewer 
than 20 values. 


THEOREM 1. Every positive integer is a sum of 38 values of (x⁴ + x²)/2. 
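
A finite range of Theorem 1 can be confirmed by a simple dynamic program. The sketch below is ours and proves nothing beyond the range searched; the limit 3000 is an arbitrary choice and the function names are hypothetical. Since f(0) = 0 is an admissible value, a sum of at most 38 positive values counts as a sum of 38 values.

```python
def quartic_values(limit):
    """Positive values of (x^4 + x^2)/2 not exceeding limit."""
    vals, x = [], 1
    while (x ** 4 + x ** 2) // 2 <= limit:
        vals.append((x ** 4 + x ** 2) // 2)
        x += 1
    return vals


def min_summands(values, limit):
    """best[n] = least number of positive values from `values` whose sum is n."""
    best = [0] * (limit + 1)
    for n in range(1, limit + 1):
        # values contains 1, so the minimum is always taken over a non-empty set
        best[n] = 1 + min(best[n - v] for v in values if v <= n)
    return best


if __name__ == "__main__":
    limit = 3000
    best = min_summands(quartic_values(limit), limit)
    print(max(best), max(best[:100]))
```

The second printed number checks the remark that every integer below 100 needs fewer than 20 values.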


3. Second, take u = 1/2, v = −1/2. Then q(x) = (x² − x)/2 becomes 
(y² + y)/2 for y = x − 1. We may discard the negative value −1 of y 
corresponding to x = 0 since q(x) = 0 also when x = 1, which corresponds to y = 0. 
The fact that every integer N ≥ 0 is a sum of three triangular numbers 
therefore implies that N is a sum‡ of three values of q(x) for integers x > 0. Thus (4) 
implies that every positive multiple of 6 is a sum of 36 values of f(x) = (x⁴ − x²)/2 
for integers x. Conversely, any sum of values of f(x) is a multiple of 6. In fact, 
if x is any integer, x⁴ − x² is a multiple of 4 and of 3. 


THEOREM 2. Every positive integer is a sum of 36 (always positive integral) 
values of (x⁴ − x²)/12. 


4. Maillet§ investigated positive integers A which are sums of four values 
of 


* Also when we add c to f and 2c to q. The modifications of the later theory are evident. 
† Since 8n + 3 is a sum of three squares, each necessarily an odd square (2x + 1)², where x ≥ 0. 
Hence n = Σ q(x). 

‡ This q(x) and ½x(x + 1) are the only functions q(x) = ux² + vx such that every positive integer 
is a sum of three values of q(x) and such that q(x) is an integer ≥ 0 for every integer x ≥ 0. 

§ Bulletin de la Société Mathématique de France, vol. 23 (1895), pp. 40-49. He did not find the 
actual limit (12) for A. 




WARING THEOREMS WITH POLYNOMIAL SUMMANDS 


(5)    q(x) = (mx² + nx)/2,    m > 0. 

If we can find positive odd integers k and t such that 

(6)    (3k − 2)^{1/2} − 1 < t < (4k)^{1/2}, 

(7)    A = mk/2 + nt/2, 


then by Cauchy's lemma there exist four integers ≥ 0 whose sum is t and the 
sum of whose squares is k, whence by (5) and (7), A is a sum of four values of 
q(x) for integers x ≥ 0. Write 


(8)    r² = n² + 2Am,    w² = 24Am + 12mn − 8m² + 9n². 
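
Cauchy's lemma can be tested by direct search for small values. The sketch below is ours and uses the usual integer form of the hypotheses, t² < 4k and 3k < t² + 2t + 4, which we take to be equivalent to the condition on t for odd k and t; the function names are hypothetical, and a finite search is of course no proof.

```python
def cauchy_condition(k, t):
    """Hypotheses of Cauchy's lemma for positive odd integers k and t."""
    return (k > 0 and t > 0 and k % 2 == 1 and t % 2 == 1
            and t * t < 4 * k and 3 * k < t * t + 2 * t + 4)


def four_squares_with_sum(k, t):
    """Return (a, b, c, d), a >= b >= c >= d >= 0, with sum t and sum of squares k, else None."""
    for a in range(t + 1):
        for b in range(min(a, t - a) + 1):
            for c in range(min(b, t - a - b) + 1):
                d = t - a - b - c          # d >= 0 because c <= t - a - b
                if d <= c and a * a + b * b + c * c + d * d == k:
                    return (a, b, c, d)
    return None


if __name__ == "__main__":
    for k in range(1, 40, 2):
        for t in range(1, 14, 2):
            if cauchy_condition(k, t):
                print(k, t, four_squares_with_sum(k, t))
```

Each printed quadruple gives four values of x whose q-values, by (5) and (7), add up to A = mk/2 + nt/2.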


Let A ≥ m/2, whence w² ≥ (2m + 3n)². Elimination of k between (6) and (7) 
gives 


(9)    L < t < G,    L = (w − 2m − 3n)/(2m),    G = 2(r − n)/m. 

Let m and n be relatively prime odd integers. If G > L + 2m, there are m 
consecutive positive odd integers t₁, ···, t_m between L and G in (9). If 
2A − nt_i ≡ 2A − nt_j (mod m), then t_i and t_j are congruent modulo m and hence 
modulo 2m. Since their difference is numerically < 2m, they are equal. Hence 
there is a positive odd integer t satisfying both (9) and 2A − nt ≡ 0 (mod m). 
Then k = (2A − nt)/m is an odd integer satisfying (7). This k is positive since 
2A ≥ nG (> nt), which follows by eliminating A by (8₁) and using (r − n)² ≥ 0. 

Conversely, when k and t are positive integers, (7) and (9) imply (6) if 
mt + 2n ≥ 0 (trivial if n ≥ 0) and hence if mL + 2n ≥ 0, viz., w ≥ 2m − n. The 
latter follows from its square and hence from 


(9′)    6mA ≥ 3m² − 4mn − 2n²    (if n < 0). 

The condition G − L > 2m is equivalent to 

(10)    4r > w + T,    T = n − 2m + 4m². 

This follows from its square: 16r² − w² − T² > 2wT. The latter follows from 
its square (11) if its left member is ≥ 0, which is true when 

(10′)    8Am ≥ T² − 7n² + 12mn − 8m². 

Write w² = 24Am + H. Then 

(11)    (8Am + J)² = (16r² − w² − T²)² > (2wT)² = 4T²(24Am + H), 
        H = 12mn − 8m² + 9n²,    J = 8m² − 12mn + 7n² − T². 


In the inequality transpose the terms with A to the left and complete the 
square on A. Hence 


(12)    8Am > 6T² − J + 2TP,    P² = H − 3J + 9T², 

(12′)    P² = 16m²(1 + 6n − 12m + 12m²). 

Conversely, if P ≥ 0, T ≥ 0, (12) implies (11). 
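
The algebra leading to (12′) is easily checked by machine. In the sketch below, which is ours, T, H and J are taken as T = n − 2m + 4m², H = 12mn − 8m² + 9n², J = 8m² − 12mn + 7n² − T², and P² = H − 3J + 9T²; the function names are hypothetical. The last function finds the least integer A satisfying (12), using a floating-point square root, which is adequate for the small cases tried here.

```python
import math


def T_of(m, n):
    return n - 2 * m + 4 * m * m


def H_of(m, n):
    return 12 * m * n - 8 * m * m + 9 * n * n


def J_of(m, n):
    return 8 * m * m - 12 * m * n + 7 * n * n - T_of(m, n) ** 2


def P_squared(m, n):
    """P² as defined in (12)."""
    return H_of(m, n) - 3 * J_of(m, n) + 9 * T_of(m, n) ** 2


def least_A_for_12(m, n):
    """Smallest integer A with 8Am > 6T² − J + 2TP (assumes P² >= 0, m > 0)."""
    T, J = T_of(m, n), J_of(m, n)
    bound = 6 * T * T - J + 2 * T * math.sqrt(P_squared(m, n))
    return math.floor(bound / (8 * m)) + 1


if __name__ == "__main__":
    print(all(P_squared(m, n) == 16 * m * m * (1 + 6 * n - 12 * m + 12 * m * m)
              for m in range(1, 30) for n in range(-30, 30)))
    print(least_A_for_12(3, -1), least_A_for_12(5, -3))
```

For m = 3, n = −1 and m = 5, n = −3 the function returns the limits 478 and 2613 that the text quotes for the pentagonal case and for Theorem 7.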


THEOREM 3. If m and n are relatively prime odd integers and m > 0, P ≥ 0, 
T ≥ 0, every integer A ≥ ½m, large enough to satisfy (12), (9′) and (10′), is a sum 
of four values of (5) for integers x ≥ 0. 

5. Hence (4) used four times shows that 6A is a sum of 48 values of 
f(x) = (mx⁴ + nx²)/2. Let D denote the g.c.d. of 6 and f(1). Let g be any given 
multiple of D. Then f(1)y + 6z = g is solvable in integers y, z, with 0 ≤ y ≤ 5. 
Thus 6N + g = 6(N + z) + y f(1). Hence every sufficiently large multiple of D is a 
sum of 48 + y ≤ 53 values of f(x). This is equivalent to the more complicated 
Theorem V of Maillet, who proved it by an extensive argument based on 
Cauchy's lemma. 

It is a new result that we may replace 53 by 50. Write 


(13)    m = 2M + 1,    n = 2N − 1, 
where M and N are integers. Then 
f(0) = 0, f(1) = M + N, f(2) ≡ 4(M + N), f(3) ≡ 3(M + N) (mod 6). 


These with 2f(1) and f(1) + f(2) evidently form a complete set of residues 
modulo 6 if M + N is prime to 6. Let G be the largest of the six numbers just 
used. Let I be any integer ≥ G. Hence there exist integers a, b, each ≥ 0, 
such that f(a) + f(b) is ≤ G and is ≡ I (mod 6). Thus I − f(a) − f(b) is a 
positive multiple 6A of 6. If A is large enough to satisfy (12), 6A is a sum of 48 
values of f(x). 

Next, if M + N is even and prime to 3, we see that f(0), f(1) and 2f(1) are 
congruent modulo 6 to 0, 2, 4 in some order. Thus all large even integers are 
sums of 50 values of f(x), which is always even. 

Next, if M + N is an odd multiple of 3, f(1) ≡ 3 (mod 6). If I is a multiple 
of 3, one of I, I − f(1) is a multiple of 6. 


THEOREM 4. Let m and n be relatively prime odd integers, m > 0. Let D 
denote the g.c.d. of 6 and (m + n)/2. Then every sufficiently large multiple I of D 
is a sum of 50 values of f(x) = (mx⁴ + nx²)/2. We may replace 50 by 49 if D = 3, 
and by 48 if D = 6. The theorem holds if I ≥ G + 6A, where A is large enough to 
satisfy (12), (9′) and (10′), while G is the largest of 2f(1), f(1) + f(2) and f(3) 
if D = 1; G = 2f(1) if D = 2; G = f(1) if D = 3; G = 0 if D = 6. An equivalent 
statement is that every integer ≥ (G + 6A)/D is a sum of t values of (mx⁴ + nx²)/(2D), 
which is always integral, where t = 50 if D = 1 or 2, t = 49 if D = 3, t = 48 if D = 6. 




6. For the case of polygonal numbers of order m + 2, we have n = 2 − m. 
Then 0 < P < 4·3^{1/2} m(2m − 1) if m ≥ 2, and (12) is seen to hold if A ≥ 28m³ 
(first proved by Legendre). Also (9′) holds if A ≥ 5m/6, and (10′) if A ≥ 2m³. 
But (12) holds for smaller values of A. For example, if m = 3, (12) holds if 
A ≥ 478, whereas 28m³ = 756. By the writer's* table of sums of four polygonal 
numbers, the case m = 3 shows that every integer < 480, except the six in 
Theorem 5, is a sum of four pentagonal numbers. 


THEOREM 5. Every integer except 9, 21, 31, 43, 55, and 89, is a sum of four 
pentagonal numbers (3x² − x)/2. Hence every integer is a sum of five, one of 
which is 0 or 1. 
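
The six exceptions can be recovered by direct enumeration over a finite range. The sketch below is ours; it uses the pentagonal numbers (3x² − x)/2 for x ≥ 0, so that 0 is available as a summand, and the limit 1000 is an arbitrary choice. It lists the integers up to the limit that are not sums of four such numbers.

```python
def pentagonal_numbers(limit):
    """Pentagonal numbers (3x² − x)/2 for x >= 0, up to limit."""
    vals, x = [], 0
    while (3 * x * x - x) // 2 <= limit:
        vals.append((3 * x * x - x) // 2)
        x += 1
    return vals


def not_sum_of_four(limit):
    """Integers in 0..limit that are not a sum of four pentagonal numbers."""
    pent = pentagonal_numbers(limit)
    two = {p + q for p in pent for q in pent if p + q <= limit}
    four = {s + t for s in two for t in two if s + t <= limit}
    return [n for n in range(limit + 1) if n not in four]


if __name__ == "__main__":
    print(not_sum_of_four(1000))
```

The search agrees with the table cited in the text for integers below 480 and finds no further exceptions up to the limit tried.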


COROLLARY 1. Except when N is one of those six numbers, 24N + 4 is a sum 
of four squares of integers 6x − 1 with x ≥ 0. 


COROLLARY 2. If v is a fixed one of the numbers 2–4, 6–9, every positive 
integer is a sum of four integers each of which is v or pentagonal. 


For, each of the six exceptions in Theorem 5 is such a sum. 
7. In this section we prove two lemmas. 


LEMMA 1. Let f(z) be an integer ≥ 0 for every integer z ≥ 0. Let g(x) denote the 
greatest integer ≤ f(x + 1)/f(x). Then every positive integer I < f(x + 1) exceeds a 
sum of at most g(x) + g(x − 1) + ··· + g(1) values of f(z) by an integer which 
is ≥ 0 and < f(1). 


For, I = C(x)f(x) + r(x), where C(x) ≤ g(x), 0 ≤ r(x) < f(x), C and r being 
integers. Thus every r(x) is expressible as C(x − 1)f(x − 1) + r(x − 1). 
Repetitions show that 


I = C(x)f(x) + C(x − 1)f(x − 1) + ··· + C(1)f(1) + u, 
where 0 ≤ u < f(1). 
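
The proof of Lemma 1 is a greedy division by f(x), f(x − 1), ···, f(1) in turn. A small sketch of that procedure follows; it is ours, with a hypothetical function name, and it is tried on the first eight values 1, 34, 189, ···, 10144 of the quartic (5x⁴ − 3x²)/2, whose quotients g(8), ···, g(1) add up to 49.

```python
def greedy_count(I, values):
    """Number of summands used when the largest available values are removed first.

    `values` must be increasing and begin with f(1) = 1, so the final remainder is 0.
    """
    count = 0
    for v in reversed(values):
        c, I = divmod(I, v)   # c plays the part of C(x); the new I is r(x)
        count += c
    return count


if __name__ == "__main__":
    f = [(5 * x ** 4 - 3 * x ** 2) // 2 for x in range(1, 10)]   # f(1), ..., f(9)
    bound = sum(f[i + 1] // f[i] for i in range(8))              # g(1) + ... + g(8)
    worst = max(greedy_count(I, f[:8]) for I in range(1, f[8]))
    print(bound, worst)
```

The greedy count never exceeds the bound of the lemma, though it need not be the least possible number of summands.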

LEMMA 2. Define f(z) and g = g(2) as in Lemma 1. Let f(0) = 0, f(1) = 1, and 
f(z + 1) > f(z). Then I = g f(2) − 1 is a sum of g − 2 + f(2), but not fewer, values 
of f(z) for integers z ≥ 0. 


Since g f(2) ≤ f(3), I < f(3) and the only decompositions of I are of the form 
r f(2) + s with r < g. When r = g − 1, then s = f(2) − 1 and we see that I is a 
sum of g − 1 + f(2) − 1 = v values of f(z). When r = g − k, k ≥ 2, then s = k f(2) − 1 
and the decomposition involves v + (k − 1)[f(2) − 1] > v values of f(z). 


* Bulletin of the American Mathematical Society, vol. 33 (1927), p. 718. The present result was 
also proved directly. 




It is a reasonable conjecture that every positive integer is a sum of J 
values of f(z). 
8. We now prove three theorems. 


THEOREM 6. Every positive integer is a sum of 50 values of f(x) = (3x⁴ − x²)/2. 


Here f(1) = 1, f(2) = 22, f(3) = 117, f(4) = 376, f(5) = 925. Since m = 3, 
n = −1, (5) is pentagonal, whence A = 89 by Theorem 5. In Theorem 4, 
G = f(3), whence every integer > G + 6A = 651 is a sum of 50 values of f(x). 
By Lemma 1 every integer < f(4) is a sum of 3 + 5 + 22 values of f(x). Hence 
every one < 2f(4) = 752 is a sum of 31 values. 


THEOREM 7. Every positive integer is a sum of 50 values of f(x) = (5x⁴ 
− 3x²)/2. 


Here m = 5, n = −3 and (12) holds if A ≥ 2613 (whereas Legendre's limit 
is 3500), and then (9′) and (10′) are satisfied. The successive values of f(x) 
are 1, 34, 189, 616, 1525, 3186, 5929, 10144, 16281; those of g(x) are 34, 5, 3, 2, 
2, 1, 1, 1. By Lemma 1, every integer < f(9) is a sum of 49 values of f(x). In 
Theorem 4, G = f(3), whence every integer ≥ 189 + 6(2613) = 15867 is a sum 
of 50 values. 


THEOREM 8. If every positive integer is a sum of not more than 50 values of 
f(x) = (mx⁴ + nx²)/2, then f(x) is the quartic in Theorems 1, 6, or 7, or else is 


(14)    f(x) = (7x⁴ − 5x²)/2. 


Since f(y) shall represent 1 for an integer y ≥ 0, y² must divide 2. Hence 
y = 1 and (m + n)/2 = 1. Employ (13). Then N = 1 − M, 


f(x) = (½ + M)x⁴ + (½ − M)x². 


If M = 0, 1, or 2, we have the functions in Theorems 1, 6, or 7. Hence let 
M ≥ 3. Since f(1) = 1, f(2) = 10 + 12M, f(3) = 45 + 72M, the number 9 + 12M is 
not a sum of fewer than 9 + 12M values of f(x). But 9 + 12M ≤ 50 only if 
M ≤ 3. Thus M = 3 and we get (14). Then f(1) = 1, f(2) = 46, f(3) = 261. By 
Lemma 2, 229 = 4·46 + 45 is a sum of 49, but not fewer, values of f(x). It 
was readily verified that 49 values suffice to f(4) = 856, but no further 
examination has been made. 
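
The count of 49 values for 229 can be confirmed in a line or two. Since 229 lies below f(3) for the quartic (7x⁴ − 5x²)/2, only f(1) = 1 and f(2) = 46 can occur as positive summands, and it suffices to try every number r of summands equal to f(2). The sketch is ours and the function name is hypothetical.

```python
def least_values(I, f2):
    """Least r + s with I = r*f2 + s and r, s >= 0 (summands f(2) = f2 and f(1) = 1 only)."""
    return min(r + (I - r * f2) for r in range(I // f2 + 1))


if __name__ == "__main__":
    f = lambda x: (7 * x ** 4 - 5 * x ** 2) // 2
    g = f(3) // f(2)                       # the quotient g of Lemma 2
    print(f(2), f(3), g * f(2) - 1, least_values(g * f(2) - 1, f(2)))
```

The minimum agrees with the value g − 2 + f(2) given by Lemma 2.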

One of my students is treating the many universal theorems obtained 
when D=2, 3 or 6. 

9. Finally, we take m = 2u, n = 2v, u and v relatively prime. We shall choose 
A so large that there are m/2 consecutive positive odd integers t_i between L 
and G in (9). If 2A − nt_i ≡ 2A − nt_j (mod m), then t_i ≡ t_j (mod m/2 = u). 

First, let u be odd. Then t_i ≡ t_j (mod m = 2u). But the difference between 



t_i and t_j is numerically < m. Hence they are equal. Since the m/2 even 
integers 2A − nt_i are incongruent modulo m, one of them is congruent to zero. 
Thus 


k = (2A − nt)/m = (A − vt)/u 


is an integer. It is odd* if A is odd and v is even. 

Second, let u be even, v and A both odd. Use only the first m/4 of our 
integers t_i. The difference between any two of them is numerically ≤ 2(m/4 
− 1) < m/2. Hence we have m/4 multiples 2A − nt_i of 4 which are incongruent 
modulo m. Thus one of them is ≡ 0. We employ the resulting k (corresponding 
to t) if k is odd. But if k is even, k′ (corresponding to t′ = t + m/2) is k − n/2, 
which is odd, and now we have used the last m/4 of our t_i. 

This amplification of Maillet's argument shows that, if u and v are 
relatively prime, and if u + v is odd, every sufficiently large odd integer A is a 
sum of four values of q(x) in (3). 

There will be m/2 odd integers between L and G if G − L > m, and hence 
by (9) if (10) holds with T replaced by 


(15)    T₁ = n − 2m + 2m². 
Then (11) and (12) hold with T and P replaced by T₁ and P₁, where 
(16)    P₁² = 16m²(1 + 3n − 6m + 3m²). 


THEOREM 9. Let u and v be relatively prime, and u + v be odd. Let P₁ ≥ 0, 
T₁ ≥ 0. Then A is a sum of four values of q(x) in (3) for integers x ≥ 0 if A ≥ ½m, 
A is odd and large enough to satisfy (9′), (10′) and (12) with subscripts 1 on T 
and P, and with the abbreviations (11), (15), (16). 


10. Then (4) used four times shows that $6A=6(2N+1)$ is a sum of 48
values of $f(x)$ in (3). The g.c.d. of $f(1)=u+v$ and 12 is $\delta=1$ or 3. If $g$ is any
positive multiple of $\delta$, $f(1)y+12z=g$ is solvable in integers $y$, $z$ with $0 \le y \le 11$.
As shown by elimination of $g$, $6A+g$ is a sum of $48+y \le 59$ values of $f(x)$.
Hence every large multiple of $\delta$ is a sum of 59 values of $f(x)$.

But we may reduce Maillet's 59 to 51. First, let $u+v$ be prime to 3 (as
well as to 2). We employ


$f(0)=0$, $f(1) \equiv u+v$, $f(2) \equiv 4(u+v)$, $f(3) \equiv 9(u+v) \pmod{12}$.


Their sums by two are congruent to the products of $u+v$ by 0-2, 4-6, 8-10,

* Also if A is even and v odd. But the resulting theorem is a mere corollary to Theorem 3, with m, 
n replaced by u, v, as seen by doubling the numbers and function. 




whence their sums by three give a complete set of residues modulo 12. Next,
if $u+v$ is divisible by 3, we see that $f(0)$, $f(1)$, $2f(1)$, $3f(1)$ are congruent
modulo 12 to 0, 3, 6, 9 in some order.
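
Both residue statements are finite checks modulo 12. The sketch below is illustrative only; it assumes that $f(x)$ in (3) is $ux^4+vx^2$ (consistent with $f(1)=u+v$ and with $Q(x)$ of Theorem 10), and the function names are ours.

```python
from itertools import combinations_with_replacement
from math import gcd


def f(x: int, u: int, v: int) -> int:
    """f(x) = u*x^4 + v*x^2 (assumed form of the quartic in (3))."""
    return u * x**4 + v * x**2


def residues_of_three(u: int, v: int) -> set[int]:
    """Residues modulo 12 of sums of three of f(0), f(1), f(2), f(3)."""
    vals = [f(x, u, v) for x in range(4)]
    return {sum(t) % 12 for t in combinations_with_replacement(vals, 3)}


def multiples_of_f1(u: int, v: int) -> set[int]:
    """Residues modulo 12 of 0, f(1), 2f(1), 3f(1)."""
    return {(k * f(1, u, v)) % 12 for k in range(4)}


for u in range(1, 30):
    for v in range(-29, 30):
        if gcd(u, abs(v)) != 1 or (u + v) % 2 == 0:
            continue  # keep only u, v relatively prime with u + v odd
        if (u + v) % 3 != 0:
            # u + v prime to 6: sums by three give every residue modulo 12
            assert residues_of_three(u, v) == set(range(12))
        else:
            # u + v an odd multiple of 3
            assert multiples_of_f1(u, v) == {0, 3, 6, 9}
print("checked")
```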


THEOREM 10. Let $u$, $v$ be relatively prime, $u>0$, and $u+v$ be odd. The g.c.d.
of 12 and $u+v$ is $\delta=1$ or 3. If $\delta=1$, let $G$ be the greatest of $3f(1)$, $f(1)+f(2)$,
$2f(2)$, $f(3)+2f(1)$ and $2f(3)+f(1)$. But if $\delta=3$, let $G=3f(1)$. Let $A$ be odd and
large enough to satisfy (9'), (10') and (12) in the sense of Theorem 9. Then every
integer $\ge (6A+G)/\delta$ is a sum of 51 values of $Q(x)=(ux^4+vx^2)/\delta$.

11. We now prove 

THEOREM 11. Employ the assumptions and notations of Theorem 10. If
every positive integer is a sum of 51 values of $Q(x)$, then $u+v=\delta$, $u \le 4\delta$. Either
$\delta=1$ and $u \le 3$, or $\delta=3$ and $u$ is one of the integers* 1, 2, 4, 5, 7, 8, 10, 11 prime
to 3. Conversely, every integer is a sum of 51 values at least when $Q(x)=2x^4-x^2$
or $(x^4+2x^2)/3$.


Since $1=Q(y)$ for an integer $y>0$, $y^2$ divides $\delta=1$ or 3. Hence $y=1$,
$u+v=\delta$. Also $Q(1)=1$, $Q(2)=(12u+4\delta)/\delta$, $Q(3)>Q(2)$. Hence the only
decomposition of $Q(2)-1$ into a sum of values of $Q(x)$ is that in which each
$x=1$. Thus $Q(2)-1 \le 51$, whence $u \le 4\delta$. But if $\delta=1$, $u=4$, then $Q(2)=52$,
$Q(3)=5\cdot 52+37$, and Lemma 2 with $q(2)=5$ shows that 259 is a sum of 55,
but not fewer, values.

If $u=2$, $v=-1$, we find by Theorem 9 that if $A$ is odd and $\ge 195$, $A$ is a
sum of four values of $q(x)=2x^2-x$. The successive values of $Q(x)=2x^4-x^2$
are 1, 28, 153, 496, 1225, 2556, whence those of $q(x)$ in Lemma 1 are 28, 5, 3,
2, 2, whence 40 values suffice to $Q(6)=2556$, which exceeds $6A+G=6\times 195+307=1477$.

Let $u=1$, $v=2$. By Theorem 9, we find that every odd integer $A \ge 55$ is a
sum of four values of $q(x)=x^2+2x$. The successive values of $Q(x)=(x^4+2x^2)/3$
are 1, 8, 33, 96, 225. Those of $q(x)$ in Lemma 1 are 8, 4, 2, 2, whence every
positive integer $<Q(5)=225$ is a sum of 16 values of $Q(x)$. But in Theorem 10,
$G=67$, whence all integers $\ge 133 \ge (67+330)/3$ are sums of 51 values.
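
The listed values and the totals 40 and 16 can be reproduced numerically. This sketch is ours and rests on an assumption: Lemma 1 is not reproduced in this section, and we take its $q(x)$ to be the integer part of $Q(x+1)/Q(x)$, which agrees with every value quoted above. The greedy count is only an upper bound on the least number of summands.

```python
def quotients(values: list[int]) -> list[int]:
    """Integer parts of Q(x+1)/Q(x) for consecutive listed values (assumed q(x))."""
    return [nxt // cur for cur, nxt in zip(values, values[1:])]


def greedy_count(n: int, values: list[int]) -> int:
    """Number of summands used when the largest admissible value is taken first."""
    count = 0
    for val in sorted(values, reverse=True):
        taken, n = divmod(n, val)
        count += taken
    return count


Q1 = [2 * x**4 - x**2 for x in range(1, 7)]          # 1, 28, 153, 496, 1225, 2556
Q2 = [(x**4 + 2 * x**2) // 3 for x in range(1, 6)]   # 1, 8, 33, 96, 225

results = []
for values in (Q1, Q2):
    q = quotients(values)
    # every n below the last listed value, using only the smaller values
    worst = max(greedy_count(n, values[:-1]) for n in range(1, values[-1]))
    results.append((q, sum(q), worst))
    print(values, q, sum(q), worst)
```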

It is readily verified that 1, 2, 4, 5, 7, 10, 13, 20, 25, 28, 37, 52 are the
only integers $<56$ which are not sums of four values of $x^2+2x$. Hence every
odd integer $J$ except 1, 5, 7, 13, 25, 37 is a sum of four values with $x \ge 0$.
Write $y=x+1$. Then $J+4$ is a sum of four values of $y^2$, $y \ge 1$.


COROLLARY. Every positive odd integer except 5, 9, 11, 17, 29, 41 is a sum
of four squares each $\ne 0$.


There is no such result in the literature. 
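
Both lists can be confirmed by direct enumeration over a finite range. The sketch below is illustrative only; the limit 2000 and the function name are our choices. The odd integers 1 and 3 also appear in the output because they are less than 4; the corollary's list starts at 5 since it concerns numbers $J+4$ with $J \ge 1$.

```python
from itertools import combinations_with_replacement


def sums_of_four_positive_squares(limit: int) -> set[int]:
    """All n <= limit expressible as y1^2 + y2^2 + y3^2 + y4^2 with every yi >= 1."""
    squares = [y * y for y in range(1, int(limit**0.5) + 1)]
    found = set()
    for combo in combinations_with_replacement(squares, 4):
        total = sum(combo)
        if total <= limit:
            found.add(total)
    return found


LIMIT = 2000
representable = sums_of_four_positive_squares(LIMIT)

odd_exceptions = [n for n in range(1, LIMIT + 1, 2) if n not in representable]
print(odd_exceptions)   # [1, 3, 5, 9, 11, 17, 29, 41]

# n is a sum of four values of x^2 + 2x (x >= 0) exactly when n + 4 is representable.
shifted_exceptions = [n for n in range(1, 56) if n + 4 not in representable]
print(shifted_exceptions)   # [1, 2, 4, 5, 7, 10, 13, 20, 25, 28, 37, 52]
```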


* If $u=11$, Lemma 2 shows that 239 is a sum of 51, but not fewer, values.




Part II. WARING THEOREMS FOR CUBIC FUNCTIONS 
12. In these Transactions, 1934, pp. 1-12, I discussed 
(17) $f(x) = x + \tfrac16\varepsilon(x^3 - x)$ ($\varepsilon$ an integer $>0$)


which is an integer $\ge 0$ for every integer $x \ge 0$, while $f(1)=1$. For $\varepsilon$ prime to
3, I found positive integers C and vy such that every integer =>C-3* is a sum 
of nine values of f(x) for integers x=0. 
Call $f(x)$ universal if every positive integer is a sum of nine values of $f(x)$
for integers $x \ge 0$. Since $f(2)=2+\varepsilon$, $f(3)=3+4\varepsilon$, 10 is not a sum of 9 values if
$\varepsilon \ge 9$. Lemma 2 shows that if $\varepsilon=8$, 29 is not a sum of fewer than 11 values;
while if $\varepsilon=7$, 26 is not a sum of fewer than 10 values.
If $\varepsilon=6$, $f(x)=x^3$ is known to be universal. If $\varepsilon=2$, $f(x)$ is universal (loc.
cit.). That $f(x)$ is universal if $\varepsilon=3$ was proved by Frances Baker in her
Chicago dissertation (photo-printed). It has since been verified that $f(x)$ is
universal* if $\varepsilon=1$, 4, 5.


THEOREM 12. Every positive integer is a sum of nine values of (17) for
integers $x \ge 0$ if and only if $\varepsilon=1, \ldots, 6$.
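
The three obstructions quoted for $\varepsilon=7$, 8, 9 can be confirmed by a small dynamic program, which also illustrates (but does not prove) universality for $\varepsilon \le 6$ over a finite range. The code is our sketch; the bound 2000 and the name `fewest_values` are arbitrary choices.

```python
def f(x: int, eps: int) -> int:
    """f(x) = x + eps*(x^3 - x)/6, formula (17); x^3 - x is always divisible by 6."""
    return x + eps * (x**3 - x) // 6


def fewest_values(limit: int, eps: int) -> list[int]:
    """need[n] = least number of values f(x), x >= 1, with sum n, for 0 <= n <= limit."""
    values = []
    x = 1
    while f(x, eps) <= limit:
        values.append(f(x, eps))
        x += 1
    need = [0] * (limit + 1)
    for n in range(1, limit + 1):
        # f(1) = 1 is always available, so the minimum is over a non-empty set
        need[n] = 1 + min(need[n - val] for val in values if val <= n)
    return need


print(fewest_values(10, 9)[10])   # 10
print(fewest_values(29, 8)[29])   # 11
print(fewest_values(26, 7)[26])   # 10
for eps in range(1, 7):
    print(eps, max(fewest_values(2000, eps)))
```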


13. The result quoted at the beginning of §12 is a special case of 


THEOREM 13. If € and o are relativelyt prime positive integers and if € is 
prime to 3, there exist positive integers C and v such that every integer =C -3* is 
a sum of nine values of 


(18) $f(x) = \sigma x + \tfrac16\varepsilon(x^3 - x)$.


Lemma 3. Let s denote the least integer =0 for which 3°20. If n=s+1 and 
m<e-3", then f(3m) <y-3", where y=(9e4+1)/2. If and o it 
5 suffices to take n=8. 


The condition f(3e-3") <y- 3%" is equivalent to 


(19) 6ce — S 32", 90? S (€ — 30)? + 32", 


which holds if and hence if n—12s. 
The proof of Theorem 13 differs only in minor details from that for the 
case ¢ = 1 in these Transactions, 1934, pp. 3-12, a formula there numbered (j) 


* If $\varepsilon=1$, there is no gap $\ge 2$ in a table of sums $\le 2000$ of four values, whence all positive integers
$<2000$ are sums of five. If $\varepsilon=4$, all positive integers $\le 2000$ except 17, 35, 55, 61, 73, 79, 200, 206,
213, 225 are sums of six values, whence every $J$ is a sum of seven. If $\varepsilon=5$, the exceptions to sums by
seven are 20 and 360.

+ If they had a common factor g>1, g would divide every number represented by f(x) and hence 
divide any sum of its values. 




being now cited as [j]. In [10] replace the term 3r by 3or. In the identity 
below [12] replace the term 2/ by 2cl. In T, [13], [16] and [17], replace the 
term 6 by 6c. In [15] and the identity above it, replace the term 1 by oa. 
In [20] and S; above it, and in [29], replace the term —6 by —6e, and the 
first term $1 by ¥o. In Q; in [26] replace the term 6 by 6c. 

When e=1, the three inequalities [26] are satisfied if 6, =5, b:=7, bs=11, 

‘=171, n=8, o<5-3'5/2. We see that §5 holds, with the term —60; re- 
placed by —6¢6, in [30]. Hence every integer is a sum of nine values of (18) 
if e=1, ¢<3%/2, y=8, C=171, and also for larger values of o when + is in- 
creased. 

The computations of },, be, b;, nm, C depended only on inequalities [26], 
the only present change in which occurs in §;. Hence by choice of m as a func- 
tion of o, the former values* of the 6; and C apply also here. 

Near the bottom of page 8, loc. cit., we now have A=3re, E=3ay; (mod e). 
Since ¢ is now prime to 6 and a, we can choose integers y; so that 


(20) M! — 3cy; — 60 — 30b; = 0 (mode), OS yi <e. 


Since B;=3eb; (mod e), we see as on page 9 that [16] determines Q; as an 
integer =1 (mod 4). This proves Theorem 13 when ¢ is prime to 6. 

In its proof (§7) when ¢ is even, we have merely to multiply the terms 
+3b; and —6 bya. 

14. We next prove Theorem 13 when e =3a, o#2a (mod 3). Then 


(21) D= — f(z) = or + + + — r) 


is divisible by 3* if and only if r is. Hence there exists an integer m such that 
any given integer is congruent to f(m) modulo 3*. A slight modification of 
the proof of Lemma 3 yields 


Lemma 4. Let s be the least integer =0 for which 3*20. If n=s—1 and 
O<m<a-3"*!, then for (18) with e=3a, f(m) <y-3*", where y =27(a*+1)/2. 


We again employ the formulas in these Transactions with factors o in- 
serted at the places mentioned in §13, and with f(3m) replaced by f(m). The 
essential point is that Q; is an integer. To prove this, apply the result above 
Lemma 4 with k=n+1. Hence if s; is any given integer, 


Si = f(ti) + (0 S < 


In D take z=1;, r=3"*'y,, where y; is an arbitrary integer. Then D=3**+'£, 


* Just as we now take C=17! instead of the former C=168 when e=1, so also for any ¢ a lower 
value of m may be secured by increasing the old C somewhat. 




where E is an integer, and E=oy; (mod a). Denote u;—E by q; and ¢;+3"*'y, 
by m;. Then 


f(m;) — = D = = f(m;) + 


Thus M;=3q; is the number used in the general theory. Since ¢ is prime to a, 
there is a unique integer y,; such that 


22) — cy; — 206 — bso = 0 (moda), OS <a. 


Since V;=M,, B;=3b,0 (mod e), we see that NV; —60 —B;=0 (mod e), so that 
Q, is an integer. As in these Transactions, vol. 36, page 10, we may take 
(mod 4). 

We find the following values of b;, C. Those in II and III were obtained 
by G. C. Webber. 

I. a=3p4+2. b:=1464+7, b2=20f+11, b6:=30p+15. If p=0, 30765C 
< 3089. If p=1, C is between 


= 304934 p* + 66136} p? + 53480} p? + 191308p + 255674 


and* A+B, A =36000p*, B =80262p*+ 64827 p?+ 2311553 +3089. 
II. a=3p+1. b:=14p4+5, po=20P4+7, p3=30P+11. If p=0, 503<C 
<514. If p=1, Cis between 
= 30496} p4 + 436993? + 234693 p2 + 56014p + 50232, 
4S2 = 36000p4 + 49800p* + 25830p2 + 595436 + 514% - 
III. a=3p. If p=1, b:=9, bg =13, b3=19, 8507 <C $9844. If p22, C is 
between 
41s = 304963 p4 + 577123 + 365511p? + 77188p + 14, 
15, = 36000p* + 70200p* + 456302 + 98863). 
15. Hilbert† proved that a polynomial in $x$ of degree $d$ with rational coefficients
has an integral value for every integer $x$ exceeding a fixed limit if and
only if it is a linear function with integral coefficients of the binomial
coefficients $\binom{x}{s}$ for $s=1, \ldots, d$. Replacing $x$ by $x+1$, we see that every such cubic
polynomial is the sum of


(23) $P(x) = A(x^3 - x)/6 + B(x^2 - x)/2 + Cx$ ($A$, $B$, $C$ integers)


and an integer, which we may take to be zero in a Waring problem. As in the
footnote to Theorem 13, we may assume


(24) $A$, $B$, $C$ have no common factor; $A>0$.
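
That (23) is integral for every integer $x$ follows from $(x^3-x)/6=\binom{x+1}{3}$ and $(x^2-x)/2=\binom{x}{2}$. The sketch below, with sample coefficients of our own choosing, checks this in exact arithmetic for a range of non-negative $x$.

```python
from fractions import Fraction
from math import comb


def P(x: int, A: int, B: int, C: int) -> Fraction:
    """P(x) = A(x^3 - x)/6 + B(x^2 - x)/2 + Cx, formula (23), in exact arithmetic."""
    return Fraction(A * (x**3 - x), 6) + Fraction(B * (x**2 - x), 2) + C * x


def P_binomial(x: int, A: int, B: int, C: int) -> int:
    """The same polynomial written with binomial coefficients, for x >= 0."""
    return A * comb(x + 1, 3) + B * comb(x, 2) + C * comb(x, 1)


for A, B, C in [(1, 0, 1), (2, 0, 1), (3, -5, 7), (6, 0, 1)]:
    for x in range(0, 50):
        value = P(x, A, B, C)
        assert value.denominator == 1            # integral for every tested x
        assert value == P_binomial(x, A, B, C)   # agrees with the binomial form
print(P(2, 2, 0, 1), P(3, 2, 0, 1))              # 4 11
```

With $A=6$, $B=0$, $C=1$ the polynomial reduces to $x^3$, and with $A=2$, $B=0$, $C=1$ to $(x^3+2x)/3$.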


* S:=37044 p*+B, while A is the leading term of }S2. 
† Mathematische Annalen, vol. 36 (1890), p. 511.




R. D. James* has proved that every integer exceeding a certain $L(A, B, C)$
is a sum of nine values of $P(x)$ if $A \not\equiv 4C \pmod 8$. This function $L$ was not
determined, but is excessively large. Unlike our results in §§12-14, James's
result is essentially only an asymptotic theorem and yields only asymptotic
results for sextics (Part III).

Part III. WARING THEOREMS FOR POLYNOMIALS OF DEGREE 6


16. We list some identities of degree 4 needed later. Write 
(25) r=at+b4+ct+d4, t= r+ = s?, 
6 


where s=a?+0?+c?+d?. Then 


(26) + Bb + + 1242(4B2 — A*)r = 24(2A2B? + B4)s?. 


12 


For A =2B or A =B, this becomes 


(27) dYi(2a + b + c)* = 216s, + c)4 + 127 = 24s?. 
48 16 


For A =0, (26) becomes (12). Next 


(28) + Bb)* — 6(A2 — = 
24 
where the coefficient of r is =>0 only when A?= B?, and then (28) becomes (1s). 
Again, 
(29) = 2452, 
8 
(30) + b+ + d)* + 88r = 240s”. 


32 


Except for (12) and (29), every such identity involves more than 32 fourth 
powers. 
17. We employ symmetric functions of a, b, c, d of degree 6: 


(31) i= j = k = 
4 12 a 


(32) tbtct = 8i + 8-157 + 8-6-15k, 


(33) + hb)* = + + 30wj, = gth? + ght. 
24 


* American Journal of Mathematics, vol. 56 (1934), pp. 303-315. 


Every linear combination of 7, (32) and (33) which is identical with a multiple 
of 8 =i+3j+6k is a multiple of 

(34) w(32) + 8(33) + (2a)* = 120ws®, 4M = 7w — 3g° — 328. 


In the left member of (34) replace each exponent 6 by 4. By (28) and (29), the 
resulting function becomes 


(24w + 8-12g2h?)s? + 6-8(g? — + (M — w) 
The sum of the last two parts will be zero if 
4(g* — + w — — — — g* — = 


But if the last factor is zero when g and #/ are integers, either g2=4, h=0, or 
vice versa, whence M is negative. For a Waring problem, M 20. Hence g?=h? 
and (34) becomes the double of 


(35) $g^6(32) + 8\sum_{12}(ga \pm gb)^6 + g^6\sum_{4}(2a)^6 = 120g^6s^3$,

while the like sum of fourth powers is equal to $(24g^6+48g^4)s^2$.
When $g=1$, (35) is Kempner's* identity

(36) $\sum_{8}(a \pm b \pm c \pm d)^6 + 8\sum_{12}(a \pm b)^6 + \sum_{4}(2a)^6 = 120s^3$.

The corresponding sum† of fourth powers was seen to be $72s^2$. That of squares
is $60s$. For arbitrary $u$, $v$, $w$, write


(37) $f(x) = ux^6 + vx^4 + wx^2$, $q(x) = 120ux^3 + 72vx^2 + 60wx$.


Hence


(38) $q(s) = \sum_{8} f(a \pm b \pm c \pm d) + 8\sum_{12} f(a \pm b) + \sum_{4} f(2a)$.
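
Identity (38) can be spot-checked with integer arithmetic. In the sketch below (ours, with hypothetical helper names) the three sums have 8, 12 and 4 terms, the middle one being counted 8 times, so that $8+96+4=108$ values of $f$ occur; the polynomials are $f(x)=ux^6+vx^4+wx^2$ and $q(x)=120ux^3+72vx^2+60wx$.

```python
from itertools import combinations, product


def kempner_sum(f, a, b, c, d):
    """Right member of (38): the three signed sums with weights 1, 8, 1."""
    v = (a, b, c, d)
    # 8 terms f(a ± b ± c ± d)
    total = sum(f(a + sb * b + sc * c + sd * d)
                for sb, sc, sd in product((1, -1), repeat=3))
    # 12 terms f(p ± q): two signs for each of the 6 pairs, each counted 8 times
    total += 8 * sum(f(p + sign * q)
                     for p, q in combinations(v, 2) for sign in (1, -1))
    # 4 terms f(2a), f(2b), f(2c), f(2d)
    total += sum(f(2 * t) for t in v)
    return total


def check(u, v, w, a, b, c, d):
    """True when q(a^2 + b^2 + c^2 + d^2) equals the right member of (38)."""
    f = lambda x: u * x**6 + v * x**4 + w * x**2
    q = lambda x: 120 * u * x**3 + 72 * v * x**2 + 60 * w * x
    s = a * a + b * b + c * c + d * d
    return q(s) == kempner_sum(f, a, b, c, d)


samples = [(1, 0, 0, 1, 2, 3, 4), (0, 1, 0, 5, -2, 0, 7),
           (0, 0, 1, 3, 3, 1, 2), (2, -3, 5, -1, 0, 7, 2)]
print(all(check(*t) for t in samples))   # True
```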


If we take $d=0$ in (38), we see that $q(a^2+b^2+c^2)$ is a sum of 107 values of
$f(x)$. If we take $d=c$, and note that $f(c-d)$ becomes $f(0)=0$, we see that
$q(a^2+b^2+2c^2)$ is a sum of 100 values of $f(x)$. Every positive integer not of the
form $h=4^k(16m+14)$ is represented by $a^2+b^2+2c^2$. But $h$ is not of the form
$4^k(8n+7)$ of the only positive integers not represented by $a^2+b^2+c^2$. This
proves


THEOREM 14. If $j$ is any positive integer, $q(j)$ is a sum of 107 values of $f(x)$
for integers $x$.
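
The two classical representation facts used in this proof can be illustrated by brute force over a finite range. The sketch is ours (the limit 3000 is arbitrary); it checks that the excluded classes are exactly $4^k(8n+7)$ and $4^k(16m+14)$, and that every positive integer up to the limit is represented by at least one of the two forms.

```python
def represented(limit: int, coeff_c: int) -> set[int]:
    """Integers n <= limit of the form a^2 + b^2 + coeff_c * c^2 with a, b, c >= 0."""
    found = set()
    c = 0
    while coeff_c * c * c <= limit:
        a = 0
        while coeff_c * c * c + a * a <= limit:
            b = a
            while coeff_c * c * c + a * a + b * b <= limit:
                found.add(coeff_c * c * c + a * a + b * b)
                b += 1
            a += 1
        c += 1
    return found


def has_form(n: int, modulus: int, residue: int) -> bool:
    """True if n = 4^k * (modulus*m + residue) for some k, m >= 0 (n > 0)."""
    while n % 4 == 0:
        n //= 4
    return n % modulus == residue


LIMIT = 3000
three_squares = represented(LIMIT, 1)   # a^2 + b^2 + c^2
with_2c2 = represented(LIMIT, 2)        # a^2 + b^2 + 2c^2
for n in range(1, LIMIT + 1):
    assert (n in three_squares) == (not has_form(n, 8, 7))
    assert (n in with_2c2) == (not has_form(n, 16, 14))
    assert n in three_squares or n in with_2c2
print("every n <=", LIMIT, "is represented by one of the two forms")
```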


* Dissertation, Göttingen, 1912. Extract in Mathematische Annalen, vol. 72 (1912), p. 396.
† Directly by adding (29) to the product of $(1_2)$ by 8.


18. We identify $q(x)$ in (37) with the product of (23) by a rational constant
$k$, and insert the resulting values of $u$, $v$, $w$ into $f(x)$. The latter now
has the denominator 720. Hence we write $k=720N/D$, where the integers
$N$ and $D$ are relatively prime. We get


(39) $q(x) = \dfrac{720N}{D}P(x)$, $f(x) = \dfrac{N}{D}F(x)$, $F(x) = Ax^6 + 5Bx^4 + (12C - 2A - 6B)x^2$.
In a Waring theorem with summands $f(x)$, the value of $f(x)$ is assumed to
be integral for every integer $x$. Hence $D$ divides $F(x)$ for every integer $x$. We
have


(40) $F(x) = A(x^6 - x^2) + 5B(x^4 - x^2) + Ex^2$, $E = 12C - A - B$,
(41) $F(1) = E$, $F(2) = 4(15A + 15B + E)$, $F(3) = 9(80A + 40B + E)$.


Hence $D$ divides $E$, $60(A+B)$, $360(2A+B)$ and therefore their combinations,
$720C$, $360A$. Since $D$ divides the products of $A$, $B$, $C$ by 720 and since
1 is a linear combination of $A$, $B$, $C$ by (24), we see that $D$ divides 720.

By §15, every sufficiently large integer is a sum of nine values of $P(x)$.
Hence by (39), every large multiple of $N(720/D)$ is a sum of nine values of
$q(x)$. Then by Theorem 14, the same multiple is a sum of $9 \times 107$ values of
$f(x)=NF(x)/D$. This statement is evidently equivalent to the case $N=1$ of it
and hence to Lemma 5.


Lemma 5. Let D divide all the values of F(x) in (22) for integers x. Then D 
divides 720. Every sufficiently large multiple of 720/D is a sum of 963 values 
of F(x)/D. 

This implies 

Lemma 6. Let $L$ be the least positive integer such that every integer is
congruent modulo $720/D$ to a sum of $L$ values of $F(x)/D$. Then every sufficiently
large integer is a sum of $L+963$ values of $F(x)/D$.


19. We seek the number corresponding to $L$ when the modulus is one of
the relatively prime factors 5, 9, 16 of 720. We are obliged to go into details
to obtain facts which overcome the difficulty that congruent arguments need
not yield congruent values of a polynomial whose coefficients are not integers
(§21).

If $E$ is prime to 5, (41) shows that the sums by two of the values of $F(0)$,
$F(1)$, $F(2)$ are congruent modulo 5 to 0, $E$, $2E$, $3E$, $4E$, whence every integer
is congruent to a sum of two values of $F(x)$. But if $E$ is divisible by 5, (40) and
$x^5 \equiv x$ show that $F(x) \equiv 0 \pmod 5$ for every $x$, and we employ $F(x)/5$.




Modulus 9. Evidently $F(9+x) \equiv F(x)$. Hence all values of $F(x)$ are congruent
to those with $x=0$, 1, 2, 4. But $3(A+B) \equiv -3E$. Hence


$F(2) \equiv 7E$, $F(4) = 16(255A + 75B + E) \equiv 4E$.

If $E$ is prime to 3, every integer is congruent to a sum of three values of 0, $E$,
$4E$, $7E$ modulo 9.
Next, let $E=3k$, where $k$ is prime to 3. Then $x(x^2-1)$ and hence also
$F(x)$ is divisible by 3 for every $x$. We see that $F(x)/3 \equiv 0$ or $k$ according as $x$
is or is not divisible by 3. Thus every integer is congruent to a sum of two
values of $F(x)/3$. It follows also that, for all integers $x$, $j$,


(42) $\tfrac13F(x+3j) \equiv \tfrac13F(x) \pmod 3$.


But if $E$ is divisible by 9, we employ $F(x)/9$.
20. Modulus 16. Evidently $F(8+x) \equiv F(x)$. Also, $4(A+B) \equiv -4E$. Hence
every value of $F(x)$ is congruent to one of


(43) $F(0)=0$, $F(1)=E$, $F(2) \equiv 8E$, $F(3) \equiv 9E+8B$.


Case $B$ even. We may drop the term $8B$ from (43). If $E$ is odd, every integer
is congruent to a sum of 7 values of $F(x)$.

Let $E=2m$. Then $F(x)$ is always even. Also $F(x)/2 \equiv 0$ or $m \pmod 8$ according
as $x$ is even or odd; and hence


(44) $\tfrac12F(x+2j) \equiv \tfrac12F(x) \pmod 8$, for all $x$, $j$.


When $m$ is odd every integer is congruent modulo 8 to a sum of 7 values of
$F(x)/2$. If $E=4M$, where $M$ is odd, we require a sum of 3 values of the residues
0, $M$ modulo 4 of $F(x)/4$ which is always integral. If $E=8M$, $M$ odd,
every integer is congruent modulo 2 to one of the values 0, $M$ of $F(x)/8$, which
is always integral. Finally, if $E$ is divisible by 16, we use $F(x)/16$, which is
always integral.

Case $B$ odd. If $E$ is odd, then $F(3) \equiv E$, and we use a sum of 7 values of
$F(x)$. If $E=2m$, where $m$ is odd henceforth, the values of $F(x)/2$ are $\equiv 0$, $m$,
$5m \pmod 8$, a sum of four of which yields every residue. Also,


(45) $\tfrac12F(x+8j) \equiv \tfrac12F(x) \pmod 8$, for all $x$, $j$.

Let $E=4m$. Then $F(2y)/4 \equiv 0$ and

$\tfrac14F(1) \equiv \tfrac14F(7) \equiv m$, $\tfrac14F(3) \equiv \tfrac14F(5) \equiv -m \pmod 4$,

a sum of two of which yields every residue. Here

(46) $\tfrac14F(x+4) \equiv -\tfrac14F(x) \pmod 4$, $x$ odd;
(47) $\tfrac14F(x+8j) \equiv \tfrac14F(x) \pmod 4$, all $x$, $j$.



Let $E=8m$. Then $F(x)/8 \equiv 0 \pmod 2$ unless $x \equiv \pm 1 \pmod 8$ and then
$F(x)/8 \equiv 1 \pmod 2$. Here


(48) $\tfrac18F(x+4) \equiv 1+\tfrac18F(x) \pmod 2$, $x$ odd;
(49) $\tfrac18F(x+8j) \equiv \tfrac18F(x) \pmod 2$, all $x$, $j$.


21. It remains to pass from our relatively prime moduli to the product as
modulus. In case a polynomial $p(z)$ has integral coefficients, the classic
method is to employ the Chinese remainder theorem and note that


(50) $z \equiv a \pmod M$ implies $p(z) \equiv p(a) \pmod M$.


By (42), (44) and (45), property (50) holds also for our corresponding
polynomials having denominators. There remain the cases $M=4$, 2 when $B$
is odd. In view of (47) and (49), we have only to apply the Chinese remainder
theorem when one congruence is $z \equiv a \pmod 8$ instead of $z \equiv a$ (mod $M=4$
or 2).

We have now proved 


THEOREM 15. Let $D$ be the largest integer which divides all the values of $F(x)$
in (40). Then $D$ is the g.c.d. of 720 and $E$. Then in Lemma 6, $L \le 7$, so that
every large integer is a sum of $L+963 \le 970$ values of $F(x)/D$. We have $L=7$
if $E$ is odd, or if $B$ is even and $E=2m$, where (as below) $m$ is odd. Next, $L=4$
if $B$ is odd and $E=2m$. Again, $L=3$ if $B$ is even and $E=4m$, or if $E$ is prime to 3
and divisible by 4. Also, $L=1$ if $E$ is divisible by 360. In all the remaining cases,
$L=2$.


22. Examples. By Theorem 12 every positive integer is a sum of nine
values of $P(x)=(x^3+2x)/3$, viz., (17) for $\varepsilon=2$. Hence $A=2$, $B=0$, $C=1$, and
$F(x)=2x^6+8x^2$. Here $E=10$, $D=10$, $L=7$. By Lemma 6, every integer is
congruent modulo $720/D=72$ to a sum of seven values of $Q(x)=(x^6+4x^2)/5$.
Since 0, $\pm 1, \ldots, \pm 35$, 36 form a complete set of residues modulo 72, every
integer $\ge 7Q(36)$ is a sum of 970 values of $Q(x)$. Successive values of $Q(x)$ are
1, 16, 153, 832, 3145, 9360, 23569, 52480, $Q(9)=106353$. In Lemma 1, the
successive values of $q(x)$ are 16, 9, 5, 3, 2, 2, 2, $q(8)=2$, whence every integer
$<Q(9)$ is a sum of 41 values of $Q(x)$. But $Q(x+1)<2Q(x)$ if $x \ge 9$, whence
$q(x)=1$. Hence $41+27=68$ suffice to $Q(36)$, and $6+68=74$ suffice to $7Q(36)$.
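
The value $L=7$ for this example can be confirmed by computing residues modulo 72. The sketch below is ours; it relies on the fact that $Q(0)=0$ is an admissible value, so the sets of reachable residues grow monotonically with the number of summands.

```python
def Q(x: int) -> int:
    """Q(x) = (x^6 + 4x^2)/5, integral for every integer x."""
    return (x**6 + 4 * x**2) // 5


MODULUS = 72
# residues of Q(x) modulo 72; x = 0 (giving 0) is included
residues = {Q(x) % MODULUS for x in range(MODULUS)}

reachable = {0}
L = 0
while len(reachable) < MODULUS:
    # residues of sums of one more value of Q(x)
    reachable = {(r + q) % MODULUS for r in reachable for q in residues}
    L += 1

print([Q(x) for x in range(1, 10)])
print(L)   # 7
```

Seven values are needed because $Q(x)$ is congruent modulo 8 to 0 or 1 according as $x$ is even or odd, so that a residue $\equiv 7 \pmod 8$ requires seven odd arguments.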


THEOREM 16. Every positive integer is a sum of 970 (integral) values of
$(x^6+4x^2)/5$.

Next, employ Theorem 12 with $\varepsilon=3$. Then $A=3$, $B=0$, $C=1$, $E=D=9$,
$L=7$. We see that 87 suffice to $7H(40)$, where $H(x)=(x^6+2x^2)/3$. Finally, if
$\varepsilon=1$, $A=C=1$, $B=0$, $E=11$, $D=1$, $L=7$; $q(1)=9$, $q(2)=7$, $q(3)=5$, $q(4)=3$,
$q(x)=2$ if $x=5$-$8$, $q(x)=1$ if $x \ge 9$.


THEOREM 17. Every positive integer $I$ is a sum of 970 values of $(x^6+2x^2)/3$.
If $I \ge 7F(360)$, $I$ is a sum of 970 values of $F(x)=x^6+10x^2$; if $I<7F(360)$, $I$
exceeds a sum of 389 values of $F(x)$ by an integer which is $\ge 0$ and $\le 10$.


Various similar theorems are omitted for brevity. 


Part IV. WARING THEOREMS FOR CERTAIN POLYNOMIALS 
OF DEGREES 8 AND 10 


23. Employ the notations

$H(x) = ux^8 + vx^4 + wx^2$,
$f(x) = 5040ux^4 + 720vx^2 + 504wx$.


Then we have the identity* in $a$, $b$, $c$, $d$, $u$, $v$, $w$:


(51) $f(s) = \sum_{48} H(2a \pm b \pm c) + 6\sum_{4} H(2a) + 6\sum_{8} H(a \pm b \pm c \pm d) + 60\sum_{12} H(a \pm b)$.


When $v=w=0$, the identity is due to A. Hurwitz.† When $u=w=0$, it follows
from $(27_1)$, (29), $(1_2)$.

Take $w=0$. Then $f(x)=720(7ux^4+vx^2)$. To apply Theorem 10, let $7u$ and
$v$ be relatively prime and $7u+v$ be odd. Let $\delta$ be the g.c.d. of the latter and 3.
Hence every sufficiently large multiple of 720 is a sum of 51 values of $f(x)$
and hence by (51) (with 840 values of $H$) is a sum of $51 \times 840$ values of $H(x)$
or $H(x)/3$ according as $\delta=1$ or 3. A similar theorem with 51 replaced by 50
follows from Theorem 4. To apply the better Theorem 1, take $u=1/2$, $v=7/2$.
Then $f(x)=5040(x^4+x^2)/2$. Hence every large multiple of 5040 is a sum of
$38 \times 840$ values of $(x^8+7x^4)/2$. Similarly by Theorem 2, every large multiple
of 30240 is a sum of $36 \times 840$ values of $(x^8-7x^4)/12$. We obtain Waring
theorems as in Lemma 6.

* It does not hold if $H$ has a term in $x^6$.
† Mathematische Annalen, vol. 65 (1908), pp. 424-7.
‡ Mathematische Annalen, vol. 66 (1909), p. 105; History of the Theory of Numbers, II, p. 721.


24. J. Schur‡ expressed $22680\,s^5$ as a sum of tenth powers. Replacing each
exponent 10 by 4, we see that the sum becomes the sum of $(27_1)$ and the products
of (29) by 9 and $(1_2)$ by 180, and hence is $1512\,s^2$. But if we replace
each exponent 10 by 6 or 8 we do not obtain a multiple of $s^3$ or $s^4$. Hence
$1512(15us^5+vs^2)$ is obtained from Schur's sum by replacing each $x^{10}$ by
$ux^{10}+vx^4$. A Waring theorem for the latter may therefore be deduced from
one for* $15ux^5+vx^2$.


* Maillet, Journal de Mathématiques, (5), vol. 2 (1896), pp. 363-380, proved that every large
integer is a sum of a limited number of 1's and 192 values of any polynomial in $x$ of degree 5 which is
a positive integer for all integers $x \ge g$.


UNIVERSITY OF CHICAGO, 
CHICAGO, ILL.


DERIVED NUMBERS WITH RESPECT TO 
FUNCTIONS OF BOUNDED VARIATION* 


BY 
R. L. JEFFERY 


1. Introduction. The present paper is a supplement to our previous paper.†
It deals with the distribution of values of the derived numbers of a function
$F(x)$ with respect to a function of bounded variation $\alpha(x)$, and the possibility
of determining $F$ from one of these derived numbers when the derived number
in question is finite. That $F$ can be determined from its derivative with respect
to $\alpha$ when this derivative is finite has been shown by Lebesgue,‡ and
also by the present writer.§ The method used by Lebesgue involves a transformation
which reduces integration with respect to a function of bounded
variation to ordinary integration. That of the present writer is direct. Lebesgue
remarks|| concerning his method that it does not seem suitable for
handling the corresponding problems which arise when the derivative with
respect to $\alpha$ is replaced by derived numbers with respect to $\alpha$. We have found
that these problems will yield to the direct method of treatment, but the
analysis is complicated. In the present paper we use a transformation¶ which
is different from that of Lebesgue, but which, like his, reduces the operations
of differentiation and integration with respect to a function of bounded variation
to the corresponding operations in the ordinary sense.

We shall be concerned with functions $F$ which are constant on intervals
throughout which $\alpha$ is constant, and for which $F(x-0)$ and $F(x+0)$ both
exist at the discontinuities of $\alpha$, and consequently at the discontinuities of $\omega$,
the variation function of $\alpha$. At the points of discontinuity of $\omega$ let $\psi(x, h)
= \{F(x+h)-F(x \pm 0)\}/m\omega(x, h)$, $\pm$ holding according as $h \gtrless 0$. At the points
of continuity of $\omega$, let $\psi(x, h) = \{F(x+h)-F(x)\}/m\omega(x, h)$ when $m\omega(x, h) \ne 0$,
$\psi(x, h)=0$ when $m\omega(x, h)=0$. Then the upper and lower limits of $\psi(x, h)$ as $h$


* Presented to the Society, April 15, 1933; received by the editors June 11, 1933, and, in revised 
form, March 19, 1934. 

† Non-absolutely convergent integrals with respect to functions of bounded variation, these Transactions,
vol. 34, pp. 645-675, the notation of which is carried throughout the present paper. The paper
cited is referred to in what follows as T. It contains some typographical errors: p. 656, line 16, D.F 
should read | D,F|; the numerator of the last inequality on this page should read | F(xe41—0) 
—F(x,—0)| ; the left side of the first inequality on p. 657 should read | >> F(x%41—0) —F(x.—0)|. 

‡ Leçons sur l'Intégration, Paris, 1928, pp. 296-307.

§ T, p. 657, Theorem IX. 

|| Loc. cit., p. 307. 

¶ This transformation was suggested by S. Saks. It simplified essentially our discussion.




tends to zero through positive values and through negative values respectively
are the upper and lower right and left derived numbers of $F$ with respect to $\omega$,
$D_\omega F^+$, $D_\omega F_+$, $D_\omega F^-$, $D_\omega F_-$. This set of derived numbers we designate by $\Delta_\omega F$.
If these derived numbers are all equal then their common value is the derivative
of $F$ with respect to $\omega$, $D_\omega F$. If $\omega$ is the variation function of $\alpha$ then
$D_\omega\alpha = g = \pm 1$, except for at most a set of $\omega$-measure zero.* At points where
$g = \pm 1$ we define the set $\Delta_\alpha F$ by the relation $\Delta_\alpha F = \Delta_\omega F/g$. Where $g$ is different
from $\pm 1$ the set $\Delta_\alpha F$ is determined by considering the various limits as $h$
tends to zero of the ratio $\{F(x+h)-F(x)\}/\{\alpha(x+h)-\alpha(x)\}$. If the limit as
$h$ tends to zero of $\psi(x, h)$ is equal to $a$, for $x+h$ taking on any values except
those of a set of $\omega$-density zero† at $x$, then $a$ is the approximate derivative
of $F$ with respect to $\omega$, $AD_\omega F$, and where $g = \pm 1$, $AD_\alpha F = AD_\omega F/g$.

2. The distribution of the values of the derived numbers of F with respect 
to a. Let w be the variation function of a, and e, the points of (a, 6) at which 
w is continuous and which do not belong to intervals throughout which w is 
constant. If y=w(x), then according to our previous conventionsf{ the set x; 
of discontinuities of w go into a countable set of open intervals B;=(b/ , b/’) 
on the interval {w(a), w(b)} =(u, v), and the countable set of intervals a; 
throughout which w is constant go into a countable set y; on (u, v). Then, to 
each value of y on this closed interval, except the set y; and the end points of 
8;, there corresponds a single point x, on (a, 6). For such values of y let 
¢(y) =F(x,). At a point of the set y; let ¢(y) have the constant value of F 
on the corresponding interval of the set a;. At the end points 5}, b/’ of the 
intervals 6; let (b/ ) =F (x;—0), and $(b/’) =F(x;+0). 

The function ¢(y) is now defined at every point of the closed interval 
(u, v). Let y. be the set w(e.). At almost all points y of y. the density of the 
set 8; is zero. At such a point y let us compare the various limits as Ay tends 
to zero of the ratio 


(1) $\dfrac{\phi(y+\Delta y)-\phi(y)}{\Delta y}$


with the corresponding limits as $h$ tends to zero of the ratio


(2) $\dfrac{F(x_y+h)-F(x_y)}{m\omega(x_y, h)}$.


* Daniell, these Transactions, vol. 19, p. 361. The result there given evidently holds under the 
present definition of a derivative. 

t+ T, p. 662, where right hand w-density of the set E is defined. w-density is the limit as 4 tends 
to zero of the ratio ME(x—h, x+h)/mo(x—h, x+h). 

t T, p. 646, 1. 




If y+Ay is a point of y, and x,+4 is the corresponding point of e., then the 
two ratios are the same. This is also the case if y+Ay is a point of ;, and 
x,y+h is on the corresponding interval a;. Let y+-Ay be a point of 8;. Then 
if x, +h=x; we have 


(3) F(xy + h) = o(y + Ay), mw(x, h) = Ay + k, 


where | ¢;| <m;. Consider the ratio 


(4) 
Now 


(5) 


Ay|~ |ay| 


And since at the point y the density of the set of intervals 8; is zero, it follows 
that 
mB; 1 
mB; mB; 
tends to zero with Ay. As a result of this, and the fact that |¢;|/m8;<1, we 


conclude that m§,/|Ay| tends to zero with Ay. Relations (4) and (5) then show 
that 


mw(x, h) 
Ay 


It then follows from (3) and (6) that if y+Ay is a point of 8;, and «,+h=<,, 
where x; is the point of discontinuity of w corresponding to the interval 8,, 
then the ratios (1) and (2) have the same limits of indetermination. 

It remains to consider the case in which y+Ay is an end point of 8;. Let 
yt+Ay=6/. Then ¢(y+Ay) =¢(b/) =F(x;—0). Let x; be a sequence of values 
of x belonging to e, or to a;, and tending to x; from the left. Then F(x:) tends 
to F(x;—0) =¢(y+Ay), and if x,+h=2x, then mw(x,, h) tends to Ay. Thus 
for / sufficiently large the ratio (2) is arbitrarily near to the ratio (1). A like 
manner of reasoning may be used to show that the same situation prevails 
when y+Ay=5}’. 

We have now proved that every value that is approached by the ratio (1) 
as Ay tends to zero, is also approached by the ratio (2) as / tends to zero over 
a suitably chosen sequence of values of h. Starting with the ratio (2) and 




letting 4 tend to zero through all possible values, it can be shown by reasoning 
similar to the above that for every limit approached by (2) there is a sequence 
of values of Ay tending to zero over which the ratio (1) approaches the same 
limit. It then follows that, except for a part of e, of w-measure zero, the dis- 
tribution of the values of the set A. at the points of e, is the same as the dis- 
tribution of the values of the derived numbers of ¢(y) at the points of y,. 
At the set x; of discontinuities of w, DF exists and is finite. If w is the varia- 
tion function of a, then where D.a =g = +1, A. =A.F/g. This relation holds 
except for a set of w-measure zero. At a point where g= —1 an upper derived 
number with respect to w may correspond to a lower derived number with 
respect to a, and conversely. But where one is finite the other is also. Further- 
more, if the function F is measurable relative to a* on (a, 6), then the func- 
tion ¢ is measurable on (yu, v). Consequently, if we take into consideration the 
known facts concerning the distribution of the values of the derived numbers 
of measurable functionst we have the following result: 


Let a(x) be a function of bounded variation on the interval (a, b), and let 
the function F(x) be finite at each point of (a, b), measurable relative to a on 
(a, 6), constant on intervals throughout which a is constant, and such that at the 
points of discontinuity of a, F(x—0) and F(x+0) both exist. Then, except for 
at most a set of a-measure zero, the derived numbers and approximate derivatives 


of F with respect to a fall into one or the other of the following classes: 
(1) AD.F exists and is finite. 
(2) AD.F+ = = + ©, = ADF. = — ©. 
The points of class (1) are of four types: 
(1.1) D,F exists and is finite. 
(1.2) DF, = = DaF-, = 
(1.3) = ADF = DaF-, Dal’. = 
(1.4) = = 20, Daly = 
3. The determination of F(x) by means of the derived numbers of F with 
respect to a. In this section, in addition to the conditions imposed above, F is 
continuous where a is continuous, and, at points of discontinuity of a, F(x) 


lies on the interval defined by F(x—0) and F(x+0). Furthermore, the region 
of definition of F is extended beyond the interval (a, 6) in such a way that 
* T, p. 646, §1, p. 655, §6. 
t J. C. Burkill and U. S. Has'am-Jones, The derivates and approximate derivates of measurable 
functions, Proceedings of the London Mathematical Society, (2), vol. 32, pp. 346-355. 




$F(a-0) = F(a)$, and $F(b+0) = F(b)$. Let $\omega$ be the variation function of $\alpha$. On
the interval $\omega(a) = \mu < y < \nu = \omega(b)$ let $\psi$ be the function $\varphi$ defined in the
previous section, except for the intervals $\beta_i$, where $\psi$ is linear, ranging from
$F(x_i-0)$ to $F(x_i)$ on the left half of $\beta_i$ and from $F(x_i)$ to $F(x_i+0)$ on the right
half of this interval. We prove the following:

If $D_\omega F^+$ is finite at each point of $(a, b)$, then $D\psi^+$ is finite at each point of
$(\mu, \nu)$, with the possible exception of the right hand end points of the intervals $\beta_i$.

On the intervals 6; the function y is linear, and consequently Dy* is finite 
at each point y for which b/ <y<b/’. For y a point of y. and y+Ay a point 
of y. or y;, the limits of indetermination of the ratios 


$$\frac{\psi(y+\Delta y)-\psi(y)}{\Delta y} \quad\text{and}\quad \frac{F(x_y+h)-F(x_y)}{m\omega(x_y,\,x_y+h)}$$


are the same provided x,+h is so chosen that y+Ay=w(x,+h). The same 
statement holds if y is a point of y;, provided x, is the right hand end point 
of $\alpha_i$, and $F(x_y)$ is replaced by $F(x_y-0)$. Hence, if $D_\omega F^+$ is finite, either the
upper limit of the first ratio is finite or this ratio becomes positively infinite
as $\Delta y$ tends to zero with $y+\Delta y$ on intervals of the set $\beta_i$. Let $y+\Delta y$ be on
$(b_i', b_i'')$; let $b_i' - y = \Delta'y$, $\Delta y = \Delta'y + t_i'$, and let $m\beta_i = t_i$. There are then
two cases to consider: (i) $\Delta'y/t_i$ bounded from zero; (ii) $\Delta'y/t_i$ tending to zero
as $\Delta'y$ tends to zero. In case (i) the ratio $\Delta\psi/\Delta y$ lies between the two ratios


F(x; + 0) — F(x,) F(x; — 0) — F(x,) 
and 


Ay +t! 
i.e., between 
F(x; +0) —F(x,) /A’y + tf 4 F(x; — 0) — F(x,) /A’y + 
an 
A’y + A’y +4; A’y A’y 


Since $\Delta'y + t_i = m\omega(x_y, x_i)$, $\Delta'y = m\omega(x_y, x_i-0)$, and since $D_\omega F^+$ is finite, it
follows that the numerators of these last two expressions are bounded above,
and since $\Delta'y/t_i$ is bounded from zero, it follows that their denominators are
bounded from zero. Thus it has been shown that $D\psi^+ < \infty$. It remains to be
shown that $D\psi^+ > -\infty$. If the ratio $\Delta\psi/\Delta y$ becomes negatively infinite for
every sequence of values of Ay, then the ratio AF/Aw becomes negatively 
infinite for x, +h points of e, or a;. Hence, since D,F* is finite, we must have 
AF /Aw tending to a finite limit for x,+4 points of the set x; of discontinuities 
of w. In this case, 


$$\frac{\Delta F}{m\omega(x_y,\,x_y+h)} = \frac{F(x_i)-F(x_y)}{m\omega(x_y,\,x_i)} = \frac{\psi(y+\Delta y)-\psi(y)}{m\omega(x_y,\,x_i)},$$




where $\Delta y = \Delta'y + t_i/2$ and $m\omega(x_y, x_i) = \Delta'y + t_i$. But this makes $\Delta y/m\omega(x_y, x_i) \ge \tfrac{1}{2}$.
Hence, if $\Delta\psi/\Delta y$ becomes negatively infinite, so does $\Delta\psi/\Delta\omega$ and its
equivalent ratio $\Delta F/m\omega(x_y, x_i)$, from which it follows that $D_\omega F^+ = -\infty$. But
this is a contradiction. We can, therefore, conclude that when $\Delta'y/t_i$ is
bounded from zero, $D\psi^+$ is finite.

Remark. In case (i) if the function $\psi$ is defined as above, except that it is
a single linear function on $\beta_i$ ranging from $F(x_i-0)$ to $F(x_i+0)$, the same
conclusions hold in regard to $D\psi^+$. It was this definition of $\psi$ that was suggested
by Saks. But if $\psi$ is defined in this manner there exist functions $F$ and
$\omega$ such that in case (ii) $D_\omega F^+$ is finite and $D\psi^+ = -\infty$. We exhibit such an
example. It throws light on the whole situation.

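On the interval $(0, e)$ let

$$x_n = \sum_{i=n}^{\infty}\frac{1}{i!}, \quad\text{and let}\quad \omega(x) = \sum_{i=n}^{\infty}\frac{1}{i!} \ \text{ on } \ x_n \le x < x_{n-1}.$$

It is easily verified that if $\beta_n = \omega(x_n-0) < y < \omega(x_n+0)$ then for $y = 0$, $\Delta'y/t_i$
tends to zero as $\Delta'y$ tends to zero, which is the condition of case (ii). Let
$F(0) = 0$, and on $x_n \le x < x_{n-1}$ let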
F(q-1) — F(0) 


Ft) = - —; then — 1, 
) 1! mw(0, Xn—1) 


which shows that D,F* is finite. Also 


mw(0, Xn-1 + 


which becomes negatively infinite as $n$ increases. Hence $D_\omega F_+ = -\infty$. Now
let $\psi$ be linear on $\beta_n$ and range from $F(x_n-0)$ to $F(x_n+0)$. Then if $y = 0$, and
$y+\Delta y$ is on $\beta_n$,


6,(n) + n(n — 
62(n) + n! 


since ¢/’/t =n. The functions @,(m) and 62(m) tend to unity as m increases. 
From this it follows that the last member of the foregoing equality, and 
consequently Ay/Ay, becomes negatively infinite as m increases. Hence 


= 1 1 1 
1+—+——_+ --- 
1! n—1 n(n—1) 
a | i! n n(n + 1) 
1 
——i}’ 
Ay 
t=n i! 




We now show that for the function $\psi$ defined at the beginning of this section
we have in case (ii), just as in case (i), that $D_\omega F^+$ finite implies $D\psi^+$ finite.
If $D_\omega F^+$ is finite and $D\psi^+ = +\infty$, then $\Delta\psi/\Delta y$ must become infinite for $y+\Delta y$
on $\beta_i$. We then have


Ay _ + Ay) — + 4'y) <2 ti 
Ay if 
+ A’y) vo) + 
A'y 


The denominators of both ratios on the right are greater than or equal to 
unity, and since D,F*+ is finite the numerator of the second ratio is finite. 
Hence if Ay/Ay becomes positively infinite so does the numerator of the first 
ratio. If we set ¥(y+Ay) then ¢/’/t/ becomes infinite, and 


_ P(x — 0) — F(%y) 
A’y + 


=A+B. 


Let us compare A and B with 


F(x; — 0) — F(xy) 
A’y + 


and B’ = 


where t/’ <F(x,), and <r} <i,/2. If A is negative then A’>4A, and if 
A is positive, A’>0. Since = it follows that B’2>B. A’y/t; 
tends to zero, and since $\tau_i''/\tau_i'$ becomes positively infinite, it follows that $B'$
becomes infinite as $\Delta'y$ tends to zero and $\tau_i'$ tends to $t_i/2$. Hence, if $\Delta\psi/\Delta y$
becomes positively infinite, so does $A'+B'$ for $\tau_i' = t_i/2$. But this means that
the ratio $\{F(x_i) - F(x_y)\}/t_i$ tends to $+\infty$. Then, since $\Delta'y/t_i$ tends to zero,
it follows that $\{F(x_i) - F(x_y)\}/m\omega(x_y, x_i)$ tends to $+\infty$. But this means that,
at the point $x_y$, $D_\omega F^+ = +\infty$, which is a contradiction. We conclude, therefore,
that $D\psi^+ < \infty$. The proof that $D\psi^+ > -\infty$ is the same as in case (i).

We now know that if, at each point of $(a, b)$, $D_\omega F^+$ is finite, then, at each
point of $(\mu, \nu)$, $D\psi^+$ is finite, except possibly the right hand end points of the
intervals $\beta_i$, at which points the left hand derivative of $\psi$ exists and is finite.
Hence if, at each point of $(a, b)$, $D_\omega F^+$ is finite, then at each point of $(\mu, \nu)$
one of the derived numbers of y is finite. At the points of y., except at most a 
null set, this finite derived number can be taken as Dy+, and will be equal 
to D,F+ at the corresponding points of e., which is all of e, except at most a 
set of w-measure zero. Furthermore, on an interval of the set 6;, 


Ay tf’ 

Ay A'y + tf 

Aly + 




$$\int_{\beta_i} D\psi^+\,dy = \psi(b_i'') - \psi(b_i') = F(x_i+0) - F(x_i-0) = \int_{x_i} D_\omega F^+\,d\omega.$$


Let $a' < x < a''$ be an interval on $(a, b)$ with $a'$, $a''$ points of continuity of $\omega$,
and $b' = \omega(a') < y < b'' = \omega(a'')$ the corresponding interval on $(\mu, \nu)$. Let
$f(y) = D\psi^+$ where $D\psi^+$ is finite, and otherwise let $f(y)$ be the left hand derivative
of $\psi$. Then

$$\int_{b'}^{b''} f(y)\,dy = \psi(b'') - \psi(b') = F(a'') - F(a'),$$


where the integration is in the sense of Denjoy.* Since at almost all of y, the 
function f(y) =D.F+ at the corresponding points of e., and since the integral 
of f(y) over 8; is equal to the integral of D.F over x;, it follows that D,F*+ is 
Denjoy integrable with respect to wt on (a’, a’’), and that 


$$\int_{a'}^{a''} D_\omega F^+\,d\omega = \int_{b'}^{b''} f(y)\,dy = F(a'') - F(a').$$


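Now let $(l, m)$ be any interval on $(a, b)$, $(a_n', a_n'')$ a sequence of intervals for
which $l < \cdots < a_2' < a_1' < a_1'' < a_2'' < \cdots < m$, where $a_n'$ and $a_n''$ are points
of continuity of $\omega$, $a_n'$ tending to $l$ and $a_n''$ tending to $m$. Then

$$F(m-0) - F(l+0) = \lim_{n\to\infty} \int_{a_n'}^{a_n''} D_\omega F^+\,d\omega = \int_{l<x<m} D_\omega F^+\,d\omega,$$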


where the integration with respect to $\omega$ is in the sense of Denjoy. Also

$$F(l+0) - F(l-0) = \int_{x=l} D_\omega F^+\,d\omega,$$

$$F(m+0) - F(m-0) = \int_{x=m} D_\omega F^+\,d\omega.$$


By putting $l = a$ and $m = x$ these results then permit us to state the following
theorem:

If $f(x)$ is finite at each point of $a < x < b$, and is the upper right derivative
of $F$ with respect to a non-decreasing function $\omega$, where $F$ satisfies the conditions laid
down at the beginning of this section, then if $x$ is any point on $(a, b)$,


* Lebesgue, loc. cit., p. 150. 
† T, p. 665, §10.




$$F(x-0) - F(a) = \int_{a \le t < x} f\,d\omega, \qquad F(x+0) - F(a) = \int_{a \le t \le x} f\,d\omega,$$

where the integration with respect to $\omega$ is in the sense of Denjoy.


In a similar manner the same result can be established for any of the
other derived numbers of $F$ with respect to $\omega$. If it is known that $f$ is equal
to one of the derived numbers of $F$ with respect to $\omega$, not necessarily the same
derived number at each point, and if further for the intervals $\alpha_i$ throughout
which $\omega$ is constant $f$ is a right hand derived number at the upper end, or a
left hand derived number at the lower end, then it follows by reasoning similar
to the above that one of the derived numbers of $\psi$ is finite at each point
of $(\mu, \nu)$, and this in turn leads to the truth of the foregoing theorem in the
present case.

If $\alpha$ is a function of bounded variation on $(a, b)$ and $\omega$ the variation function
of $\alpha$, then where $g = \pm 1 = D_\omega\alpha$, $\Delta_\alpha F = \Delta_\omega F/g$. Hence, if at such points one
of the set $\Delta_\omega F$ is finite, then one of the set $\Delta_\alpha F$ is finite. Where $g$ is different
from $\pm 1$, the set $\Delta_\alpha F$ is determined from the ratio $\Delta F/\Delta\alpha$. It is not difficult
to construct functions $F$ and $\alpha$ for which $D_\omega F^+$ is finite, $D_\alpha F^+ = +\infty$,
$D_\alpha F_+ = -\infty$. If, however, at this exceptional set for all $|h|$ sufficiently small,
$\Delta\alpha = \alpha(x+h) - \alpha(x)$ does not change sign unless $h$ changes sign, it is easily
shown that one of the set $\Delta_\omega F$ finite implies one of the set $\Delta_\alpha F$ finite. When
the function $\alpha$ satisfies this condition, we have


If the function $f(x)$ is finite at each point of the interval $a < x < b$ and equal
to one of the derived numbers of $F$ with respect to $\alpha$, for the intervals of the set $\alpha_i$
throughout which $\alpha$ is constant either a right hand derived number at the upper
end or a left hand derived number at the lower end, then

$$F(x-0) - F(a) = \int_{a \le t < x} f\,d\alpha, \qquad F(x+0) - F(a) = \int_{a \le t \le x} f\,d\alpha,$$

where the integration with respect to $\alpha$ is in the sense of Denjoy.*


At the points for which $g = \pm 1$, $f = \theta/g$ where $\theta$ is one of the derived
numbers of $F$ with respect to $\omega$. If we set $\theta_1(x) = \theta(x)$ where $g = \pm 1$, and $\theta_1$
equal to a finite derived number of $F$ with respect to $\omega$ at the remaining points
of $(a, b)$, then on the intervals $a \le t < x$ and $a \le t \le x$, the function $\theta_1$ is integrable
in the sense of Denjoy with respect to $\omega$ to the values $F(x-0) - F(a)$ and

* The integral of $f\,d\alpha$ is defined as the integral of $fg\,d\omega$, in whatever sense the latter integral exists,
T, p. 655, §6. It is stated, T, p. 675, §13, that if the integral of $f\,d\omega$ exists in the sense of Denjoy then
the integral of $fg\,d\omega$ exists in the same sense. Obviously, this is only necessarily true when $f$ is summable
with respect to $\omega$, since $g$ may be negative where $f$ is negative and positive where $f$ is positive.




$F(x+0) - F(a)$ respectively. Hence for the interval $(a, x)$ we have, since
$\theta = \theta_1$ except for at most a set of $\omega$-measure zero,

$$\int f\,d\alpha = \int fg\,d\omega = \int \theta_1\,d\omega = F(x \mp 0) - F(a),$$

$\mp$ holding according as the integration is taken over the interval $a \le t < x$,
or $a \le t \le x$. This establishes the theorem.


ACADIA UNIVERSITY, 
WOLFVILLE, Nova Scotia 


PROBABILITY AND STATISTICS* 


BY 
J. L. DOOB†


The theory of probability has made much progress recently in the direction
of completely mathematical formulations of its methods and results.‡
The purpose of this paper is to make a further contribution in this direction. 
In order to analyze the results of repeated trials of an experiment, a certain 
space of infinitely many dimensions is the proper tool. This space is discussed 
in the first section of the paper. In the second section, the results of the first 
are applied to obtain for the first time a complete proof of the validity of the 
method of maximum likelihood of R. A. Fisher, which is used in statistics 
to estimate the true probability distribution when the results of a repeated 
experiment are known. 


1. THE SPACE $\Omega(F)$


It will be seen that the space $\Omega(F)$ described below provides the natural
basis for the analysis of experiments with repeated trials. The preliminary
facts, which are not new, will be stated in the form of a theorem.


THEOREM 1. Let $F(x)$ be a monotone non-decreasing function, defined for
$-\infty < x < \infty$, and satisfying

$$F(x-0) = F(x), \qquad \lim_{x\to\infty} F(x) = 1, \qquad \lim_{x\to-\infty} F(x) = 0. \tag{1}$$

There is a $\sigma$-field§ of point sets on the $x$-axis, including all Borel measurable sets,
and a completely additive non-negative set function $p_F(A)$ defined on this $\sigma$-field,
such that if $I$ is any interval $a \le x < b$, $p_F(I) = F(b) - F(a)$.||


* Presented to the Society, March 31, 1934; received by the editors April 11, 1934.

† National Research Fellow.

‡ Cf. the treatment of A. Kolmogoroff, Ergebnisse der Mathematik, vol. 2, No. 3: Grundbegriffe
der Wahrscheinlichkeitsrechnung.

§ A field is a collection of point sets with the property that if $A$ and $B$ are sets in the collection,
$A+B$, $A - A\cdot B$, $A\cdot B$ are also. A field is a $\sigma$-field if whenever $A_1, A_2, \cdots$ is a sequence of sets in the
field, $\sum_{j=1}^{\infty} A_j$ is also in the field. It will then follow that $\prod_{j=1}^{\infty} A_j$ is in the field. A set function $p(A)$
defined on the sets of a $\sigma$-field is completely additive if when $A_1, A_2, \cdots$ is a sequence of disjunct sets
in the field, $p\big(\sum_{j=1}^{\infty} A_j\big) = \sum_{j=1}^{\infty} p(A_j)$.

|| The sets in the field of definition of $p_F$ will be called measurable with respect to $F(x)$. If $A$ is
measurable with respect to $F(x)$, $p_F(A)$ is the variation of $F(x)$ over $A$. The definitions of functions
measurable with respect to $F(x)$ and of their integration are formulated in the usual way, giving the
Lebesgue-Stieltjes integral.




Let $\Omega(F)$ be the space whose points are the sequences $(\cdots, x_{-1}, x_0, x_1, \cdots)$
where $x_j$ is any real number. There is a $\sigma$-field of point sets of $\Omega(F)$, including all
sets determined by conditions of the form

$$x_j \in E_j \qquad (j = 0, \pm 1, \cdots), \tag{2}$$

where the point sets $E_0, E_{\pm 1}, \cdots$ are measurable with respect to $F(x)$, and a completely
additive non-negative set function $P_F(\Lambda)$ defined on this field, such that if $\Lambda$
is of the type (2),

$$P_F(\Lambda) = \prod_{j=-\infty}^{\infty} p_F(E_j). \tag{3}$$

The sets in the field of definition of $P_F$ will be called measurable with respect
to $F(x)$; the measurability (with respect to $F(x)$) and integration of
functions defined on $\Omega(F)$ are then defined in the usual way. This space was
first discussed by Daniell.*
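
The product formula (3) can be illustrated numerically. The following is only a minimal sketch under stated assumptions: the logistic distribution function `F`, the helper names `p_F` and `P_F`, and the particular cylinder set are hypothetical choices made for illustration and do not appear in the text.

```python
import math

def F(x):
    """Example distribution function (an assumption): the standard logistic law."""
    return 1.0 / (1.0 + math.exp(-x))

def p_F(a, b):
    """p_F of the interval a <= x < b, that is F(b) - F(a)."""
    return F(b) - F(a)

def P_F(cylinder):
    """P_F of a set of type (2).

    `cylinder` maps a coordinate index j to an interval (a_j, b_j).
    Coordinates that are not listed are unrestricted, so their factor is 1.
    """
    prob = 1.0
    for a, b in cylinder.values():
        prob *= p_F(a, b)
    return prob

# Conditions on three coordinates: -1 <= x_{-1} < 1, x_0 >= 0, x_2 < 0.
A = {-1: (-1.0, 1.0), 0: (0.0, math.inf), 2: (-math.inf, 0.0)}
print(P_F(A))
```

Only finitely many coordinates are restricted here, so the infinite product in (3) reduces to a finite one.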

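It should be noted that if $\varphi(\omega)$ is a measurable function on $\Omega(F)$, and if
$\varphi(\omega)$ depends only on $x_j$: $\varphi(\omega) = f(x_j)$, $f(x)$ is measurable with respect to $F(x)$,
and

$$\int_{\Omega(F)} \varphi(\omega)\,d\omega = \int_{-\infty}^{\infty} f(x)\,dF(x), \tag{4}$$

where the existence of either integral implies that of the other.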

This space $\Omega(F)$ is introduced as a tool in the rigorous analysis of certain
ideas in the theory of probability. Let $F(x)$ determine a probability distribution,
i.e. we suppose that there is a chance variable $\mathbf{x}$ such that the probability
that $\mathbf{x} < x$ is $F(x)$. Then (1) is satisfied. If a single trial is made, $p_F(A)$ is the
probability that the value of $\mathbf{x}$ obtained will be in the set $A$. If a finite succession
of trials is made, obtaining values $\xi_1, \cdots, \xi_n$, and if $\Lambda$ is a point set of
$\Omega(F)$ on which $P_F$ is defined, $P_F(\Lambda)$ is the probability that there is a point
$(\cdots, x_0, x_1, \cdots)$ of $\Lambda$ such that $x_j = \xi_j$, $j = 1, \cdots, n$. The usual interpretation
if $\Lambda$ is a set of the form (2) is obvious. The advantage of this point of
view† is that the set-up is independent of the number of trials. Chance varia-

* Annals of Mathematics, (2), vol. 20 (1919), pp. 281-288. Daniell actually only considered the
space whose points are sequences of the form $(x_1, x_2, \cdots)$, but the treatment of $\Omega(F)$ could be carried
through in the same way. These considerations concerning the space $\Omega(F)$ can be considered as a
particular case of a general treatment given by Kolmogoroff, loc. cit., pp. 24-30.

† A similar point of view was taken by A. Khintchine, Zeitschrift für angewandte Mathematik
und Mechanik, vol. 13 (1933), pp. 101-103, who treated the case of a chance variable which only
takes on the values 1 or 0 (making less restrictions on $P(A)$ however). This space was used for the
same purpose by E. Hopf, Journal of Mathematics and Physics of the Massachusetts Institute of
Technology, vol. 13 (1934), pp. 51-102. The place of these methods in the theory of stochastic processes
was discussed by the writer in the Proceedings of the National Academy of Sciences, vol. 20
(1934), pp. 376-379.




bles become measurable functions on $\Omega(F)$, and their integrals on $\Omega(F)$ are
their expectations. The law of large numbers will be seen to correspond to the
ergodic theorem of Birkhoff.* The convergence of a sequence of chance variables
in probability† is simply convergence in measure on $\Omega(F)$.‡


THEOREM 2. The transformation $T$ of $\Omega(F)$ into itself,

$$T:\quad x_j' = x_{j+1} \qquad (j = 0, \pm 1, \cdots),$$

is a one-to-one measure-preserving transformation. If $\Lambda$ is a measurable set invariant
under $T$, $P_F(\Lambda) = 0$, or $P_F(\Lambda) = 1$.§ If $\varphi(\omega)$ is any measurable function
on $\Omega(F)$ such that $\int_{\Omega(F)} |\varphi(\omega)|^2\,d\omega$ exists and such that $\varphi(T\omega) = e^{i\lambda}\varphi(\omega)$ for some
real number $\lambda$,

$$\varphi(\omega) = \int_{\Omega(F)} \varphi(\omega)\,d\omega$$

almost everywhere on $\Omega(F)$.


The second part of the theorem includes the first part if $\lambda = 0$, and if $\varphi(\omega)$
is considered as the characteristic function of a point set, so only the second
part of the theorem need be considered. The proof will be given in several
steps.

(i) Let $F(x)$ be 0 for $x < 0$, $x$ for $0 \le x \le 1$ and 1 for $x > 1$, and let $p_F(A)$
and $P_F(\Lambda)$ for this $F(x)$ be denoted by $p_0(A)$, $P_0(\Lambda)$, respectively. Let $\Omega_0$ be
the subset of $\Omega(F)$ consisting of the points $(\cdots, x_{-1}, x_0, x_1, \cdots)$ whose coordinates
satisfy the inequalities $0 \le x_j \le 1$, $j = 0, \pm 1, \cdots$. It will be shown
that the general set functions $p_F(A)$ and $P_F(\Lambda)$ can be derived from $p_0(A)$
and $P_0(\Lambda)$. In fact, let $y = F(x)$ transform the points of the $x$-axis into points
of the interval $0 \le y \le 1$, where if $F(x)$ has a jump at $x_0$, the point $x_0$ will be
made to correspond to the interval $F(x_0) \le y \le F(x_0+0)$. Then $p_F(A)$ is defined
for those and only those sets whose images on the $y$-axis are Lebesgue
measurable, and for such sets $p_F(A)$ is defined as the Lebesgue measure of the
image of $A$. In the same way the set $\Lambda_x$ on $\Omega(F)$ measurable with respect to
$F(x)$ goes over into a set $\Lambda_y$ on $\Omega_0$ on which $P_0(\Lambda)$ is defined, and $P_F(\Lambda_x) = P_0(\Lambda_y)$.
Then it is sufficient to prove Theorem 2 for the space $\Omega_0$ and the set
function $P_0(\Lambda)$.


* Cf. A. Khintchine, loc. cit., and E. Hopf, loc. cit., p. 95.

† For the definition of convergence in probability, see for instance Kolmogoroff, loc. cit., p. 31.

‡ Convergence in measure was defined and discussed by F. Riesz, Paris Comptes Rendus, vol.
148 (1909), pp. 1303-1305.

§ If $F(x)$ does not increase, except for equal jumps at $x = 0, \cdots, 9$, the set function $P_F(\Lambda)$ has
a simple interpretation as ordinary two-dimensional Lebesgue measure, and this property (metrical
transitivity) was proved by W. Seidel, Proceedings of the National Academy of Sciences, vol. 19
(1933), pp. 453-456. Hopf obtained this result from the second part of the corollary to this theorem
(see below) by a different method.
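
The correspondence $y = F(x)$ of step (i) can be sketched for a distribution function with one jump. This is a hedged illustration only: the particular `F` (an atom of mass 1/2 at $x = 0$ together with a uniform part on $(0, 1)$) and the helper names are assumptions introduced here, not objects from the text.

```python
def F(x):
    """Left-continuous distribution function, F(x - 0) = F(x) (illustrative choice)."""
    if x <= 0.0:
        return 0.0
    if x <= 1.0:
        return 0.5 + 0.5 * x
    return 1.0

def F_right(x):
    """Right-hand limit F(x + 0) of the same distribution function."""
    if x < 0.0:
        return 0.0
    if x < 1.0:
        return 0.5 + 0.5 * x
    return 1.0

def image_of_point(x):
    """The y-interval from F(x) to F(x + 0) to which the point x corresponds."""
    return (F(x), F_right(x))

def p_F_point(x):
    """p_F of the single point x: the length of its image on the y-axis."""
    lo, hi = image_of_point(x)
    return hi - lo

def p_F_interval(a, b):
    """p_F of a <= x < b: the length of the image, which is F(b) - F(a)."""
    return F(b) - F(a)

print(image_of_point(0.0), p_F_point(0.0), p_F_interval(0.0, 0.5))
```

The jump at $x = 0$ is spread over the interval from 0 to 1/2 on the $y$-axis, so the point carries measure 1/2, while every point of continuity has an image of length zero.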
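(ii) The set of all complex-valued functions $\varphi(\omega)$ on $\Omega_0$ whose real and
imaginary parts are measurable on $\Omega_0$ and such that

$$\int_{\Omega_0} |\varphi(\omega)|^2\,d\omega$$

exists can be considered as the set of elements of a Hilbert space* $\mathfrak{H}$ if the
inner product of $\varphi_1(\omega)$, $\varphi_2(\omega)$ is defined in the usual way† as

$$\int_{\Omega_0} \varphi_1(\omega)\,\overline{\varphi_2(\omega)}\,d\omega.$$

It is easily seen that the set of functions of the form

$$\exp\Big\{2\pi i \sum_{j=-n}^{n} n_j x_j(\omega)\Big\},$$

where $x_j(\omega)$ is the value of $x_j$ for the point $\omega: (\cdots, x_{-1}, x_0, x_1, \cdots)$ and where
$n_j$, $n$ are arbitrary integers, form a complete orthonormal set of functions in
$\mathfrak{H}$.‡ If these functions, arranged in some order, are $\varphi_0(\omega), \varphi_1(\omega), \cdots$ where
$\varphi_0(\omega) = 1$, to every function $\varphi(\omega)$ in $\mathfrak{H}$ corresponds a series $\sum_{j=0}^{\infty} a_j\varphi_j(\omega)$, where
the coefficient $a_j$ is determined by

$$a_j = \int_{\Omega_0} \varphi(\omega)\,\overline{\varphi_j(\omega)}\,d\omega, \tag{5}$$

such that

$$\int_{\Omega_0} |\varphi(\omega)|^2\,d\omega = \sum_{j=0}^{\infty} |a_j|^2. \tag{6}$$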
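(iii) Now suppose that $\varphi(T\omega) = e^{i\lambda}\varphi(\omega)$. Then if $b_0, b_1, \cdots$ are the coefficients
corresponding to $\varphi(T\omega)$,

$$b_j = e^{i\lambda} a_j, \tag{7}$$

and, from the simple form of the transformation $T$, if $j > 0$,

$$a_j = b_{r(j)} = e^{i\lambda} a_{r(j)}, \tag{8}$$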

* For a general reference to Hilbert space see, for instance, M. H. Stone, Linear Transformations
in Hilbert Space, American Mathematical Society Colloquium Publications, vol. 15 (especially chapter
I). The properties of $\Omega_0$ which are needed here (separability, etc., if distance is properly defined),
are given by Daniell, loc. cit., p. 281. Using these properties the proof that the functions $\{\varphi(\omega)\}$ form
a Hilbert space follows the lines of a similar theorem in Stone, pp. 23-29.

† If $\xi$ is a complex number, $\bar{\xi}$ will denote its conjugate.

‡ This concept is discussed by Stone, loc. cit., pp. 7-14, where the facts stated below are proved.




where $r(j) \ne j$. Repeating this we find a sequence of coefficients $a_{m_1}, a_{m_2}, \cdots$,
where $m_1 = j$, $m_i = r(m_{i-1})$ if $i > 1$, whose absolute values are all equal. Evidently
$m_i \ne m_j$ if $i \ne j$. This contradicts (6) unless $a_j = 0$. Then $a_j = 0$ if $j > 0$,
and $\varphi(\omega) = a_0$, as was to be proved.*


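COROLLARY. (i) If $\varphi(\omega)$ is any integrable function on $\Omega(F)$,

$$\lim_{n\to\infty} \frac{1}{n}\sum_{j=1}^{n} \varphi(T^j\omega) = \int_{\Omega(F)} \varphi(\omega)\,d\omega \tag{9}$$

almost everywhere on $\Omega(F)$.
(ii) If $\varphi_1(\omega)$, $\varphi_2(\omega)$ are measurable functions the squares of whose absolute
values are integrable on $\Omega(F)$,

$$\lim_{n\to\infty} \int_{\Omega(F)} \varphi_1(T^n\omega)\,\varphi_2(\omega)\,d\omega = \int_{\Omega(F)} \varphi_1(\omega)\,d\omega \int_{\Omega(F)} \varphi_2(\omega)\,d\omega. \tag{10}$$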


(i) This part of the corollary is simply the ergodic theorem in this case.†

(ii) This part of the corollary corresponds to the extension of the ergodic
theorem given by E. Hopf, B. O. Koopman and J. von Neumann, to the particular
case where there are no "angle variables."‡ It is obvious when $\varphi_1(\omega)$
and $\varphi_2(\omega)$ each depend only on a finite number of the coordinates of $\omega$:
$(\cdots, x_0, \cdots)$, since in that case the terms in (10) are equal to the limit prescribed
for sufficiently large values of $n$. Since any measurable function can
be approximated by functions depending only on a finite number of coordinates,§
the general theorem can be reduced to this case.
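
Part (i) of the corollary can be observed numerically. The sketch below is illustrative only and rests on assumptions not made in the text: the coordinates are simulated as independent uniform values on $(0, 1)$ with a fixed seed, the function is taken as $\varphi(\omega) = x_0 x_1$, and the shift is taken to act by $(T^j\omega)_k = x_{j+k}$. Under these assumptions the integral of $\varphi$ over the space is $1/4$.

```python
import random

def phi_shifted(coords, shift):
    """phi(T^shift w) for phi(w) = x_0 * x_1, where (T^shift w)_k = x_{shift + k}."""
    return coords[shift] * coords[shift + 1]

def shifted_average(coords, n):
    """(1/n) * sum over j = 1..n of phi(T^j w), the left member of (9)."""
    total = 0.0
    for j in range(1, n + 1):
        total += phi_shifted(coords, j)
    return total / n

rng = random.Random(0)                         # fixed seed, so the run is reproducible
n = 200_000
coords = [rng.random() for _ in range(n + 2)]  # x_0, ..., x_{n+1}
avg = shifted_average(coords, n)
print(avg)  # expected to lie near the integral of phi, which is 1/4
```

The successive terms $x_j x_{j+1}$ are not independent of one another, which is why the ergodic theorem rather than the elementary law of large numbers is the relevant statement here.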

The following lemma is needed for the proof of the next theorem. 


LEMMA. Let $F(x)$ be defined as in Theorem 1. Define measure on the $x$-axis
by the set function $p_F$. Let $f(x)$ be a function defined for almost all values of $x$ and
measurable (with respect to $F(x)$). Then if

$$\limsup_{n\to\infty} \frac{|f(x_n)|}{n} < \infty \tag{11}$$

on a set of points $\omega: (\cdots, x_0, \cdots)$ of $\Omega(F)$ of positive measure, $\int_{-\infty}^{\infty} f(x)\,dF(x)$
exists (as a Stieltjes-Lebesgue integral).||


* Stone, loc. cit., p. 10.

† For a simple proof of the ergodic theorem, following the lines of the first proof, given by
Birkhoff, cf. A. Khintchine, Mathematische Annalen, vol. 107 (1933), pp. 485-488. In this proof
the function $\varphi(x, r)$ corresponds to the function $\sum_{j=1}^{n} \varphi(T^j\omega)$ used here.

‡ E. Hopf, Proceedings of the National Academy of Sciences, vol. 18 (1932), pp. 204-209; B. O.
Koopman and J. von Neumann, ibid., pp. 255-263. In these treatments a continuous set of transformations
is considered, instead of the set of iterates of a single transformation as here, but the
treatment needs no essential change to make it applicable to this case.

§ Cf. Daniell, loc. cit., p. 283.

|| The Stieltjes-Lebesgue integral is defined in the same way as the ordinary Lebesgue integral
except that $p_F$-measure is used instead of ordinary Lebesgue measure.




By hypothesis there is a positive number $M$ such that

$$\limsup_{n\to\infty} \frac{|f(x_n)|}{n} < M \tag{12}$$

on a set of points $\Lambda$ of $\Omega(F)$, $P_F(\Lambda) > 0$. Let $\Lambda_N$ be the point set on $\Omega(F)$ at
which

$$\underset{n \ge N}{\mathrm{L.U.B.}}^{*}\ \frac{f(x_n)}{n} > M.$$

Then $\Lambda_N \supset \Lambda_{N+1}$ and

$$\lim_{N\to\infty} P_F(\Lambda_N) \le 1 - P_F(\Lambda) < 1.$$

Let $E_n$ be the set of values of $x$ at which $f(x) > nM$. Then a point $\omega: (\cdots, x_0, \cdots)$
belongs to the complement of $\Lambda_N$ if and only if $x_n$ is in the complement
of $E_n$ for $n \ge N$. Then the complement of $\Lambda_N$ is of the form (2), so from (3),

$$P_F(\Lambda_N) = 1 - \prod_{n=N}^{\infty} \big[1 - p_F(E_n)\big]. \tag{13}$$

Since $\lim_{N\to\infty} P_F(\Lambda_N) < 1$, the infinite product is convergent. Then $\sum_{n=0}^{\infty} p_F(E_n)$
must be convergent,† and it is easily shown from the definition of the
Lebesgue-Stieltjes integral that this implies that $f(x)$ is integrable (with respect
to $F(x)$) over the set $E_0$. Substituting $-f(x)$ for $f(x)$, the proof shows
that $f(x)$ is also integrable (with respect to $F(x)$) over the set where it is negative.
Then $\int_{-\infty}^{\infty} f(x)\,dF(x)$ exists, as was to be proved.
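
The role of the infinite product in (13) can be seen in a small computation. The following sketch is an illustration under assumptions that are not part of the text: $f(x) = x$, $M = 1$, and two hypothetical tail laws, one with $p_F(E_n) = e^{-n}$ (an exponential law, integrable) and one with $p_F(E_n) = 1/(n+1)$ (a law without a finite mean). The product is truncated after a fixed number of factors.

```python
import math

def prob_Lambda_N(p_E, N, terms=10_000):
    """Truncation of (13): 1 - prod over n = N .. N+terms-1 of (1 - p_E(n))."""
    log_prod = 0.0
    for n in range(N, N + terms):
        log_prod += math.log1p(-p_E(n))   # log(1 - p_E(n)), accurate for small p_E(n)
    return 1.0 - math.exp(log_prod)

def light_tail(n):
    """p_F(E_n) = P(x > n) for the exponential law with mean 1 (assumed example)."""
    return math.exp(-n)

def heavy_tail(n):
    """p_F(E_n) = P(x > n) = 1/(n + 1), a law whose mean does not exist (assumed example)."""
    return 1.0 / (n + 1)

print(prob_Lambda_N(light_tail, 5))   # small: the series of p_F(E_n) converges
print(prob_Lambda_N(heavy_tail, 1))   # near 1: the series of p_F(E_n) diverges
```

In the first case the probability in (13) is small for moderate $N$, in agreement with integrability; in the second the truncated product already nearly vanishes, so the probability stays near 1.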

The following theorem will be put in the phraseology of the theory of 
probability. Like the lemma, it is simply a theorem on integration on Q(F). 


THEOREM 3. Let $\mathbf{x}_1, \mathbf{x}_2, \cdots$ be a sequence of independent chance variables
with the same distributions.
(i) If the expectation $E$ of $\mathbf{x}_j$ exists, then

$$\lim_{n\to\infty} \frac{1}{n}\sum_{j=1}^{n} \mathbf{x}_j = E \tag{14}$$

with probability 1.
(ii) If there is a sequence of real numbers $c_1, c_2, \cdots$ such that the probability
is positive that

$$\limsup_{n\to\infty} \Big|\frac{1}{n}\sum_{j=1}^{n} \mathbf{x}_j - c_n\Big| < \infty, \tag{15}$$

it follows that the expectation of $\mathbf{x}_j$ exists, and we have Case (i) again.*


* Throughout this paper, if $a_1, a_2, \cdots$ is a sequence of real numbers, L.U.B. $\{a_n\}$ will denote
its least upper bound.

† W. F. Osgood, Lehrbuch der Funktionentheorie, vol. 1, 4th edition, p. 528.


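Let $F(x)$ be the probability that $\mathbf{x}_j < x$. If the expectation of $\mathbf{x}_j$ exists,
it is, by (4),

$$\int_{\Omega(F)} x_j(\omega)\,d\omega = \int_{-\infty}^{\infty} x\,dF(x),$$

where $\omega$ is the point $(\cdots, x_0, x_1, \cdots)$.

(i) The first part of the theorem is simply the Corollary of Theorem 2 applied
to the function $\varphi(\omega) = x_0(\omega)$.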

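(ii) We can suppose in (ii) that there is a point set $\Lambda$ on $\Omega(F)$ of positive
$P_F$-measure, a positive number $M$ and an integer $N$ such that

$$\Big|\frac{1}{n}\sum_{j=1}^{n} x_j(\omega) - c_n\Big| < M \tag{16}$$

on $\Lambda$ if $n \ge N$. On replacing $n$ by $n-1$ and multiplying by $(n-1)/n$,

$$\Big|\frac{1}{n}\sum_{j=1}^{n-1} x_j(\omega) - c_{n-1}\frac{n-1}{n}\Big| < M \tag{17}$$

on $\Lambda$ if $n \ge N+1$. Subtracting (17) from (16),

$$\Big|\frac{x_n(\omega)}{n} - \Big(c_n - c_{n-1}\frac{n-1}{n}\Big)\Big| < 2M \tag{18}$$

on $\Lambda$ if $n \ge N+1$. By (1), $x_n(\omega)/n$ approaches 0 in measure as $n$ becomes infinite.
Then there is an integer $N_1 \ge N+1$ such that on a subset $\Lambda_n$ of $\Lambda$ of
positive $P_F$-measure

$$|x_n(\omega)/n| < M \quad\text{if } n \ge N_1.$$

Hence

$$|c_n - c_{n-1}(n-1)/n| < 3M. \tag{19}$$

From (18) and (19),

$$|x_n(\omega)/n| < 5M$$

on $\Lambda$. The lemma can now be applied, and it shows that $\int_{-\infty}^{\infty} x\,dF(x)$ exists as a
Stieltjes-Lebesgue integral. This integral is the expectation of the chance variable
$\mathbf{x}_j$.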
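
Part (i) of Theorem 3 can be watched in a simulation. This is only a sketch under assumed choices that are not in the text: the common distribution is taken to be exponential with expectation $E = 1$, the sample is drawn with a fixed seed, and `running_means` is a helper name introduced here.

```python
import random

def running_means(samples):
    """Return the list of averages (x_1 + ... + x_n)/n for n = 1, 2, ..."""
    total = 0.0
    means = []
    for n, x in enumerate(samples, start=1):
        total += x
        means.append(total / n)
    return means

rng = random.Random(1)                                     # fixed seed for reproducibility
samples = [rng.expovariate(1.0) for _ in range(100_000)]   # expectation E = 1 (assumed law)
means = running_means(samples)

for n in (10, 1_000, 100_000):
    print(n, means[n - 1])   # the averages settle near E = 1 as n grows
```

A single simulated sequence corresponds to a single point $\omega$ of the space, so the run illustrates the statement but of course proves nothing about the exceptional set of probability zero.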


* A. Kolmogoroff, Ergebnisse der Mathematik, vol. 2, No. 3: Grundbegriffe der Wahrscheinlichkeitsrechnung,
p. 59, announced the first part of this theorem, and also the second part, under the
assumption that the probability is 1 that the upper limit in (15) is 0.


The following theorem will be needed in the application of the results of 
this section. Its proof is simple and will be omitted. 


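THEOREM 4. If $F(x)$ is defined as in Theorem 1, and if $F(x)$ has an integrable
derivative $f(x)$:

$$F(x) = \int_{-\infty}^{x} f(x)\,dx,$$

there is a point set $\Lambda(F)$ on $\Omega(F)$, $P_F[\Lambda(F)] = 1$, with the following property. If
$g(x)$ is any function defined and continuous almost everywhere (in the sense of
$p_F$-measure) on the infinite interval $-\infty < x < \infty$ and such that $\int_{-\infty}^{\infty} g(x) f(x)\,dx$
exists, then

$$\lim_{n\to\infty} \frac{1}{n}\sum_{j=1}^{n} g(x_j) = \int_{-\infty}^{\infty} g(x) f(x)\,dx$$

at every point $\omega: (\cdots, x_0, \cdots)$ of $\Lambda(F)$.*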
2. THE METHOD OF MAXIMUM LIKELIHOOD 


For each value of $p$ in some point set $E$ let $f(x, p)$ be a probability density
over the interval $-\infty < x < \infty$.† Assume that the chance variable $\mathbf{x}$ has a
probability distribution whose density is $f(x, p)$ for some (unknown) value of
$p$ in $E$. Then an important problem in statistics is that of estimating the
true value of $p$ by means of large samples of values of $\mathbf{x}$, obtained independently.
This is done by the method of maximum likelihood of R. A.
Fisher‡, which has supplanted the use of Bayes' theorem. If $x_1, \cdots, x_n$ is a
sample of values of $\mathbf{x}$, and if $f(x, p)$ is the probability density of the distribution
of values of $\mathbf{x}$, the probability of obtaining a sample of values $x_1', \cdots, x_n'$
where $x_j'$ is in a small interval with midpoint $x_j$, is, in the limit, proportional
to $\prod_{j=1}^{n} f(x_j, p)$. The method of maximum likelihood takes as an approximation
to $p_0$, the true value of $p$, the value $p_n$ of $p$ (or one of them if there are several)
which makes this product a maximum. If $p_n$ approaches $p_0$ in probability
as the samples become larger, $p_n$ is called a consistent estimate of $p_0$. A
* The theorem will be needed as here stated. It can be stated in terms of Riemann-Stieltjes
integration, making unnecessary any restrictions on $F(x)$.

† This means that $f(x, p) \ge 0$, that $f(x, p)$ is defined for almost all values of $x$, is measurable and
integrable over the $x$-axis, and that $\int_{-\infty}^{\infty} f(x, p)\,dx = 1$. It is supposed that there is a chance variable $\mathbf{x}(p)$
whose values are distributed in such a way that the probability of $\mathbf{x}(p)$ being in any measurable point
set $A$ is $\int_A f(x, p)\,dx$.

‡ Philosophical Transactions of the Royal Society of London, (A), vol. 222, pp. 309-368, especially
pp. 309-330. The proofs given by Fisher and by H. Hotelling, these Transactions, vol. 32
(1930), pp. 847-859, of the validity of the method of maximum likelihood (in the sense that theorems
similar to the ones to be proved in this section hold) are not rigorous.




rigorous proof will be given in this section that, under certain hypotheses, 
the method of maximum likelihood furnishes consistent estimates. 
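
The maximizing step of the method can be sketched as follows. Everything specific in this example is an assumption made for illustration: the family is taken to be the normal densities with unit variance and mean $p$, the sample is a small fixed list, and the maximum is located by a plain grid search (the names `density`, `log_likelihood` and `grid_mle` are hypothetical). The logarithm of the product is maximized in place of the product itself, which gives the same maximizing value.

```python
import math

def density(x, p):
    """Normal density with mean p and unit variance (an assumed family)."""
    return math.exp(-0.5 * (x - p) ** 2) / math.sqrt(2.0 * math.pi)

def log_likelihood(sample, p):
    """Logarithm of the product of f(x_j, p) over the sample."""
    return sum(math.log(density(x, p)) for x in sample)

def grid_mle(sample, lo, hi, step):
    """Value of p on the grid lo, lo + step, ..., hi that maximizes the likelihood."""
    best_p = lo
    best_ll = log_likelihood(sample, lo)
    k = 1
    while lo + k * step <= hi + 1e-12:
        p = lo + k * step
        ll = log_likelihood(sample, p)
        if ll > best_ll:
            best_p, best_ll = p, ll
        k += 1
    return best_p

sample = [0.3, 1.7, 0.9, 1.4, 0.2, 1.1]
p_hat = grid_mle(sample, -2.0, 4.0, 0.001)
sample_mean = sum(sample) / len(sample)
print(p_hat, sample_mean)   # for this family the maximizing p is the sample average
```

For this particular family the grid search is unnecessary, since the maximum is attained at the sample average; it is used here only because it carries over unchanged to families without a closed-form maximum.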

THEOREM 5. For each value of $p$ in a point set $E$ let $f(x, p)$ be a probability
density on the infinite interval $-\infty < x < \infty$. Let $\mathbf{x}$ be a chance variable whose
distribution is determined by the probability density $f(x)$, and suppose that for
each set of numbers $x_1, \cdots, x_n$, $n = 1, 2, \cdots$, it is possible to find a value of
$p$ in $E$: $p_n(x_1, \cdots, x_n)$ such that

$$\prod_{j=1}^{n} f(x_j, p_n) \ge \prod_{j=1}^{n} f(x_j). \tag{20}$$

Then if

$$F(x) = \int_{-\infty}^{x} f(x)\,dx,$$

there is a set of points $\Lambda$ of $\Omega(F)$ of total probability 1: $P_F(\Lambda) = 1$, with the following
properties. Let $\omega: (\cdots, x_0, \cdots)$ be a point of $\Lambda$ and let $\{p_{a_n}(x_1, \cdots, x_{a_n})\}$
be any subsequence of $\{p_n(x_1, \cdots, x_n)\}$ for $x_j = x_j(\omega)$, $j = 1, 2, \cdots$. Set

$$f_N(x) = \underset{n \ge N}{\mathrm{L.U.B.}}\ f(x, p_{a_n}). \tag{21}$$

Suppose that $f_N(x)/f(x)$ is continuous, except possibly for a set of values of $x$
of zero probability*, and that

$$\int_{-\infty}^{\infty} f(x)\,\log^+ \frac{f_N(x)}{f(x)}\,dx\,† \tag{22}$$

exists. It follows
(i) that the integral

$$\int_{-\infty}^{\infty} f(x)\,\log \frac{\limsup_{n\to\infty} f(x, p_{a_n})}{f(x)}\,dx \tag{23}$$

exists and is not negative;
(ii) that if $\limsup_{n\to\infty} f(x, p_{a_n})$ is integrable, and if

$$\int_{-\infty}^{\infty} \limsup_{n\to\infty} f(x, p_{a_n})\,dx \le 1, \tag{24}$$

then $\limsup_{n\to\infty} f(x, p_{a_n}) = f(x)$ except possibly on a set of zero probability;
(iii) that if the sequence $\{f(x, p_{a_n})\}$ converges (except possibly on a set of 0
probability), the limit function is $f(x)$ (except possibly on a set of 0 probability).

* This means that the integral of $f(x)$ over the exceptional set is 0, i.e., that $f(x) = 0$ almost everywhere
(in the sense of Lebesgue measure) on the set. In the following integrals, in which ratios with
$f(x)$ in the denominator appear, we define the ratios as 1 when $f(x) = 0$.

† If $\xi \ge 0$, $\log^+ \xi$ is defined as $\log \xi$ when $\xi > 1$, and 0 otherwise.


In the application to statistical problems, it is part (iii) which would be
customarily used. Thus, consider the problem of estimating the mean of a
normal distribution, where the density is

$$f(x, p) = \frac{1}{\sqrt{2\pi}}\,e^{-(x-p)^2/2}, \tag{25}$$

the true value of $p$ being $p_0$. In this case if $p_{a_n}$ approaches any finite value,
it is seen at once that (22) exists. Since $f(x, p)$ is continuous in $p$, (iii) shows
that $f(x, p_{a_n})$ approaches $f(x, p_0)$, so that $p_{a_n}$ converges to $p_0$, the true value.
On the other hand, suppose that $p_{a_n}$ converges to either $+\infty$ or $-\infty$. Then
the integral (22) exists. By (iii), $f(x, p_{a_n})$, which converges to 0 (since
$|p_{a_n}| \to \infty$), approaches $f(x, p_0)$. This is impossible, so $\lim_{n\to\infty} p_n = p_0$, with
probability 1. It is usual to take for the approximation $p_n$ the average $(x_1 + \cdots + x_n)/n$.
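
The normal example can be checked on simulated data. The sketch below makes several assumptions that are not in the text: the true mean is set to $p_0 = 2$, the sample is generated with a fixed seed, and the log-likelihood is written up to an additive constant that does not depend on $p$. For each sample size it verifies the inequality (20) in logarithmic form with $p_n$ taken as the sample average.

```python
import random

def log_likelihood(sample, p):
    """Log of the product of normal densities (unit variance), up to a constant free of p."""
    return -0.5 * sum((x - p) ** 2 for x in sample)

p0 = 2.0                                   # assumed true value of the mean
rng = random.Random(7)                     # fixed seed for reproducibility
data = [rng.gauss(p0, 1.0) for _ in range(10_000)]

results = []
for n in (10, 100, 1_000, 10_000):
    sample = data[:n]
    p_n = sum(sample) / n                  # the average, used as the approximation p_n
    gap = log_likelihood(sample, p_n) - log_likelihood(sample, p0)
    results.append((n, p_n, gap))
    print(n, p_n, gap)                     # gap >= 0 is inequality (20) in logarithmic form
```

The gap equals $n(p_n - p_0)^2/2$, so it is never negative, while $p_n$ itself drifts toward $p_0$ as the sample grows.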

It is evident that if $\Lambda_0$ is the set of points $\omega: (\cdots, x_0, \cdots)$ of $\Omega(F)$ such
that at least one coordinate $x_j$ is in the set of values of $x$ at which $f(x) = 0$,
$P_F(\Lambda_0) = 0$. It will be shown that the set $\Lambda$ of this theorem can be taken as
the set $\Lambda(F) - \Lambda_0\cdot\Lambda(F)$, where $\Lambda(F)$ was described in Theorem 4. Suppose
then that $\omega: (\cdots, x_0, \cdots)$ is in this set.

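(i) From (20) and (21), if $L_t(y)$ is defined for every positive number $t$ as
$\log y$ if $y \ge t$ and as $\log t$ if $y < t$,

$$\frac{1}{a_n}\sum_{j=1}^{a_n} L_t\Big[\frac{f_N(x_j)}{f(x_j)}\Big] \ge 0 \tag{26}$$

if $n \ge N$. Now since $f \log^+ (f_N/f)$ is integrable, $f L_t(f_N/f)$ is integrable (over
the entire $x$-axis). Then letting $n$ become infinite in (26), we have, from Theorem 4,

$$\int_{-\infty}^{\infty} f(x)\,L_t\Big[\frac{f_N(x)}{f(x)}\Big]dx \ge 0. \tag{27}$$

As $N$ increases, $L_t(f_N/f)$ does not increase, and

$$\lim_{N\to\infty} L_t(f_N/f) = L_t(\bar{f}/f),$$

where

$$\bar{f}(x) = \limsup_{n\to\infty} f(x, p_{a_n}).$$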




Then we can go to the limit under the integral sign in (27)*, obtaining

$$\int_{-\infty}^{\infty} f(x)\,L_t\Big[\frac{\bar{f}(x)}{f(x)}\Big]dx \ge 0. \tag{28}$$

Let $E_t$ be the set of values of $x$ at which $\bar{f}(x)/f(x) \le t$. The integral (28) can be
separated into integrals over $E_t$ and its complement, $CE_t$. Doing this, we find
that

$$0 \le -p_F(E_t)\,\log\frac{1}{t} + \int_{CE_t} f(x)\,\log\frac{\bar{f}(x)}{f(x)}\,dx. \tag{29}$$

Letting $t$ approach 0, (29) shows that $p_F(E_0) = 0$ and that furthermore the
integral (23) exists and is not negative.
(ii) From (i), 


Now by a well known inequality†, and using (24),

(31) $\int_{-\infty}^{\infty} f(x) \log \frac{\bar f(x)}{f(x)}\,dx \le 0.$

There is equality in (31) only when $\bar f(x) = f(x)$ for almost all $x$ (in the sense of $P_F$-measure), and there is necessarily equality, by (30), so (ii) is proved.
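(iii) To prove (iii) it is only necessary to reduce it to (ii), by showing that, if

$$\bar f(x) = \lim_{n\to\infty} f(x, p_{a_n}),$$

$\int_{-\infty}^{\infty} \bar f(x)\,dx$ exists and is not greater than 1. We have

$$\int_{-\infty}^{\infty} f(x, p_{a_n})\,dx = 1,$$

so by Fatou's lemma‡, $\bar f(x)$ is integrable over $-\infty < x < \infty$ and $\int_{-\infty}^{\infty} \bar f(x)\,dx \le 1$.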


* The situation is visualized more readily when the integral is written as 


f(x) 
fi [= 


The integrand is bounded uniformly above by the integrable function $L_t(f_1/f)$ and below by $\log t$, so we can integrate term by term.

† Making the substitution $y = F(x)$, the inequality needed becomes

$$\int_0^1 \log g(y)\,dy \le \log \int_0^1 g(y)\,dy,$$

where $g(y) = \bar f(x)/f(x)$.

‡ P. Fatou, Acta Mathematica, vol. 30 (1906), pp. 375-376.




The treatment of the principle of maximum likelihood given above was for continuous distributions. The most general statement of the other extreme is as follows. To each integer $n \ge 1$ is assigned a probability $a(n, p)$ depending on $p$ which varies on some point set. The intrinsic conditions are

$$a(n, p) \ge 0, \qquad \sum_{n=1}^{\infty} a(n, p) = 1.$$


For each sample of integers 7, - - - , 7, there is a value p, of p such that 


Pn) 2 po), 

j=1 j=1 
where $p_0$ is the true value of $p$. The problem is to show (under suitable restrictions on $a(n, p)$), that $p_n$ approaches $p_0$ in probability. This problem can be treated in a similar manner to the one just treated.

The method of maximum likelihood, when analyzed more carefully, yields further information. Reverting to continuous distributions, suppose that for each value of $p$ in a neighborhood of $p_0$, $f(x, p)$ is the density of a probability distribution. The function $p_n(x_1, \dots, x_n)$ will be called an $n$th approximation of maximum likelihood to $p_0$ if it is defined on $\Omega(F)$ (where $F(x) = \int_{-\infty}^{x} f(x, p_0)\,dx$) on a set of $P_F$-measure 1, if

$$\prod_{j=1}^{n} f(x_j, p_n) \ge \prod_{j=1}^{n} f(x_j, p_0)$$

and if $\prod_{j=1}^{n} f(x_j, p)$ for fixed $x_1, \dots, x_n$ has a relative maximum at $p = p_n$. It is no restriction to assume that $p_0 = 0$.
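The two defining conditions can be checked numerically in a simple case. The following is only a minimal sketch: it assumes the unit-variance normal density with mean $p$ (a choice made here for illustration), for which the sample average maximizes the likelihood, and the function names are hypothetical.

```python
import math
import random

def log_likelihood(xs, p):
    """Sum of log f(x_j, p) for the unit-variance normal density with mean p."""
    return sum(-0.5 * math.log(2 * math.pi) - 0.5 * (x - p) ** 2 for x in xs)

def max_likelihood_mean(xs):
    """For this assumed family the likelihood is maximized at the sample average."""
    return sum(xs) / len(xs)

rng = random.Random(0)      # fixed seed so that the run is reproducible
p0 = 0.0                    # true parameter value (assumed for the illustration)
xs = [rng.gauss(p0, 1.0) for _ in range(2000)]
p_n = max_likelihood_mean(xs)

# First condition: the likelihood at p_n is at least the likelihood at p0.
dominates_true_value = log_likelihood(xs, p_n) >= log_likelihood(xs, p0)

# Second condition: p_n is a relative maximum, so nearby values do no better.
is_relative_maximum = all(
    log_likelihood(xs, p_n) >= log_likelihood(xs, p_n + h) for h in (-1e-3, 1e-3)
)
```

Logarithms are compared instead of the products themselves, since the product of 2000 densities underflows in floating point while the inequality between the logarithms is equivalent.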


THEOREM 6. For each value of $p$ in some neighborhood $|p| < a_1$, $a_1 > 0$, of $p = 0$, let $f(x, p)$ be a probability density in the infinite interval $-\infty < x < \infty$. Let the true distribution of $x$ be determined by the probability density $f(x, 0)$. Suppose

(i) that $\log f(x, p)$ can be expressed in the form

(32) $\log f(x, p) = \log f(x, 0) + p\,\alpha(x) + \frac{p^2}{2}\,\beta(x) + \gamma(x, p),$*

where $\alpha(x) f(x, 0)$, $\alpha(x)^2 f(x, 0)$, $\beta(x) f(x, 0)$ are Lebesgue measurable and integrable over $-\infty < x < \infty$ and where


* We shall assume in the discussion of this theorem that $x$ does not take on any value at which $f(x, 0) = 0$. This means leaving out sets of total probability 0 on the $x$-axis and on $\Omega(F)$, where $F(x) = \int_{-\infty}^{x} f(x, 0)\,dx$.


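$$\gamma_p(x, p) = \frac{\partial}{\partial p}\,\gamma(x, p)$$

exists for $|p| \le a_2 \le a_1$, $a_2 > 0$, and is continuous at $p = 0$;
(ii) that if

(33) $\phi(x) = \mathop{\mathrm{L.U.B.}}_{0 < |p| < a_2} \frac{|\gamma_p(x, p)|}{p^2},$

then $\phi(x) f(x, 0)$ is integrable over $-\infty < x < \infty$*;
(iii) that if $\delta(x, p)$ is defined by

(34) $f(x, p) = f(x, 0)\left[1 + p\,\alpha(x) + \frac{p^2}{2}\left(\beta(x) + \alpha(x)^2\right) + p^2\,\delta(x, p)\right],$

(35) $\lim_{p\to 0} \int_{-\infty}^{\infty} \delta(x, p) f(x, 0)\,dx = 0.$†

Then

(36) $\int_{-\infty}^{\infty} \alpha(x)^2 f(x, 0)\,dx + \int_{-\infty}^{\infty} \beta(x) f(x, 0)\,dx = 0.$

Suppose that

$$\sigma^2 = \int_{-\infty}^{\infty} \alpha(x)^2 f(x, 0)\,dx > 0.$$

Then if $p_n(x_1, \dots, x_n)$ is an $n$th approximation of maximum likelihood to $p = 0$, and if $p_n$ approaches 0 in probability:

(37) $\lim_{n\to\infty} \bar P_F(|p_n| > \epsilon) = 0$‡

for every $\epsilon > 0$,

(38) $\lim_{n\to\infty} \bar P_F(\sigma n^{1/2} p_n < \lambda) = \lim_{n\to\infty} \underline{P}_F(\sigma n^{1/2} p_n < \lambda) = \frac{1}{(2\pi)^{1/2}} \int_{-\infty}^{\lambda} e^{-x^2/2}\,dx$

for every constant $\lambda$, uniformly in $\lambda$.


* We take this to mean that $\int \phi(x)\,dF(x)$ exists so that $\phi(x)$ can be $+\infty$ on a set of zero $P_F$-measure.

† Since $f(x, p)$ is integrable over $-\infty < x < \infty$, it follows from (i) that $\delta(x, p) f(x, 0)$ is also.

‡ Such expressions will be taken to mean the probability that $|p_n| > \epsilon$ (i.e. the $\bar P_F$-measure of the set of those points on $\Omega(F)$ where $|p_n| > \epsilon$), etc. In (37) we use $\bar P_F$, the outer measure on $\Omega(F)$, instead of $P_F$, since we have not assumed that $p_n(x_1, \dots, x_n)$ is measurable with respect to $F(x)$. Similarly, $\underline{P}_F$ will denote the inner measure on $\Omega(F)$.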


The theorem states simply that, under suitable restrictions on the character of $f(x, p)$ in $p$, $p_n$ will be normal for large $n$, with variance $1/(\sigma^2 n)$.*
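The content of the theorem may be illustrated by simulation. The sketch below is not part of the argument: it assumes the unit-variance normal family with true mean 0, for which $\alpha(x) = x$ and $\sigma^2 = 1$, and the function name, sample size and number of replications are arbitrary choices made here.

```python
import math
import random
import statistics

def scaled_estimates(n, replications, seed=0):
    """Values of sigma * sqrt(n) * p_n over repeated samples (true p = 0, sigma = 1)."""
    rng = random.Random(seed)
    out = []
    for _ in range(replications):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        p_n = sum(xs) / n                  # maximum likelihood estimate of the mean
        out.append(math.sqrt(n) * p_n)     # sigma = 1 for this assumed family
    return out

values = scaled_estimates(n=50, replications=4000)
empirical_mean = statistics.fmean(values)
empirical_variance = statistics.pvariance(values)

# Fraction of scaled estimates below lambda = 1, to be compared with the
# normal integral on the right of (38) evaluated at lambda = 1.
fraction_below_one = sum(v < 1.0 for v in values) / len(values)
normal_integral_at_one = 0.5 * (1.0 + math.erf(1.0 / math.sqrt(2.0)))
```

In this family the scaled estimate is exactly normal for every $n$, so the agreement seen here reflects sampling error only; for other families the agreement would be expected only for large $n$.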
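Since

(39) $\int_{-\infty}^{\infty} f(x, p)\,dx = 1$

for all $p$ in the neighborhood considered,

(40) $p \int_{-\infty}^{\infty} \alpha(x) f(x, 0)\,dx + \frac{p^2}{2} \int_{-\infty}^{\infty} \left[\beta(x) + \alpha(x)^2\right] f(x, 0)\,dx + p^2 \int_{-\infty}^{\infty} \delta(x, p) f(x, 0)\,dx = 0.$

Dividing through by $p$ and letting $p$ approach 0, we find that, in view of (35),

(41) $\int_{-\infty}^{\infty} \alpha(x) f(x, 0)\,dx = 0.$

Dividing (40) through by $p^2$ and letting $p$ approach 0, we find in view of (35), that (36) is true.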

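The logarithm of the likelihood of a value of $p$, obtained from $n$ trials, is defined as

(42) $L_n(p) = \sum_{j=1}^{n} \log f(x_j, p) = \sum_{j=1}^{n} \log f(x_j, 0) + p \sum_{j=1}^{n} \alpha(x_j) + \frac{p^2}{2} \sum_{j=1}^{n} \beta(x_j) + \sum_{j=1}^{n} \gamma(x_j, p).$

Since $L_n(p)$ has a relative maximum at $p_n$,

(43) $L_n'(p_n) = \sum_{j=1}^{n} \alpha(x_j) + p_n \sum_{j=1}^{n} \beta(x_j) + \sum_{j=1}^{n} \gamma_p(x_j, p_n) = 0,$

if we suppose that $|p_n| < a_2$.

(A) If $p_n = 0$, $\sum_{j=1}^{n} \alpha(x_j) = 0$ also, excluding possibly a set of zero probability on $\Omega(F)$. For if $p_n = 0$, (43) becomes $\sum_{j=1}^{n} \alpha(x_j) = 0$ (if a set of zero probability on $\Omega(F)$ is ignored), since the hypotheses of the theorem imply that $\gamma_p(x, 0) = 0$ on a set of $P_F$-measure 1 on the $x$-axis.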

(B) Let $m$ be defined by

(44) $m = \int_{-\infty}^{\infty} \phi(x) f(x, 0)\,dx.$


* R. A. Fisher, loc. cit., p. 359. 
H. Hotelling, loc. cit., pp. 856-858. Through an oversight, this theorem is stated, on p. 850, 
with the variance of p,, as on. 


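Then

(45) $P_F\left\{\lim_{n\to\infty} \frac{1}{n} \sum_{j=1}^{n} \phi(x_j) = m\right\} = 1$

by the Corollary to Theorem 2, or by Theorem 3, so that

(46) $\lim_{n\to\infty} \bar P_F\left\{|p_n|\, \frac{1}{n} \sum_{j=1}^{n} \phi(x_j) \ge \epsilon\right\} = 0$

for any $\epsilon > 0$.*

(C) $P_F\left\{\lim_{n\to\infty} \frac{1}{n} \sum_{j=1}^{n} \beta(x_j) = -\sigma^2\right\} = 1,$

by the Corollary to Theorem 2 or by Theorem 3.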
Now from (43), if $0 < |p_n| < a_2$ and if the denominator does not vanish,

(47) $n^{1/2} \sigma p_n = \frac{1}{\sigma n^{1/2}} \sum_{j=1}^{n} \alpha(x_j) + R_n,$

where 


1 n 


on 


We define $R_n$ as 0 if $p_n = 0$.
Using (A), (B), (C), we shall show that

$$\lim_{n\to\infty} \bar P_F(|R_n| > \epsilon) = 0$$

for every e>0. Since 


n j=1 


and since by the Laplace-Liapounoff theorem+ 


* Equation (45) expresses the fact that a certain sequence $\{h_n\}$ of functions on $\Omega(F)$ converges to $m$ almost everywhere on $\Omega(F)$. Then since the sequence $\{|p_n|\}$ converges in measure to 0 on $\Omega(F)$, by hypothesis, the sequence $\{|p_n| h_n\}$ converges in measure to 0 on $\Omega(F)$, which fact is expressed by (46).

† A. Khintchine, Ergebnisse der Mathematik, vol. 2, No. 4: Asymptotische Gesetze der Wahrscheinlichkeitsrechnung, pp. 1-8.




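(49) $\lim_{n\to\infty} P_F\left\{\frac{1}{\sigma n^{1/2}} \sum_{j=1}^{n} \alpha(x_j) < \lambda\right\} = \frac{1}{(2\pi)^{1/2}} \int_{-\infty}^{\lambda} e^{-x^2/2}\,dx,$

the numerator of $R_n$ converges in measure to 0 and the denominator to 1 as $n$ becomes infinite. Then $R_n$ converges in measure to 0 on $\Omega(F)$ as $n$ becomes infinite:

(50) $\lim_{n\to\infty} \bar P_F\left\{\left|\sigma n^{1/2} p_n - \frac{1}{\sigma n^{1/2}} \sum_{j=1}^{n} \alpha(x_j)\right| \ge \epsilon\right\} = 0$

for every $\epsilon > 0$. Now suppose that $\sigma n^{1/2} p_n < \lambda$ on the set $E_n$ on $\Omega(F)$. Fix $\epsilon > 0$ and suppose that the difference in (50) is less than $\epsilon$ on the set $F_n$:

$$\lim_{n\to\infty} \underline{P}_F(F_n) = 1.$$

Then the points of $\Omega(F)$ where $\sigma n^{1/2} p_n < \lambda$ for any constant $\lambda$ are included in the points of the complement of $F_n$, or in the points common to $F_n$ and the set on which

$$\frac{1}{\sigma n^{1/2}} \sum_{j=1}^{n} \alpha(x_j) < \lambda + \epsilon.$$

The points of $\Omega(F)$ where $\sigma n^{1/2} p_n < \lambda$ include the points where

$$\frac{1}{\sigma n^{1/2}} \sum_{j=1}^{n} \alpha(x_j) < \lambda - \epsilon,$$

which also belong to $F_n$. These considerations show that (38) is true, since (49) is uniform in $\lambda$.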

Theorem 6 requires a slight modification if the parameter $p$ is replaced by several parameters, $p^{(1)}, \dots, p^{(r)}$. Theorem 5 evidently needs no essential change in this case. In Theorem 6 we replace (32) by

$$\log f(x, p^{(1)}, \dots, p^{(r)}) = \log f(x, p) = \log f(x, 0) + \sum_{i=1}^{r} p^{(i)} \alpha_i(x) + \frac{1}{2} \sum_{i,k=1}^{r} p^{(i)} p^{(k)} \beta_{ik}(x) + \gamma(x, p),$$

$$\beta_{ik}(x) = \beta_{ki}(x),$$

where we take the true set of parameters as $(0, \dots, 0)$, and where we suppose that the first partial derivatives of $\gamma(x, p^{(1)}, \dots, p^{(r)})$ exist in a neighborhood of the origin in the $r$-dimensional $p$-space, and are continuous at the origin. Conditions (ii) and (iii) are modified in an obvious way, and (36) becomes

(36′) $\int_{-\infty}^{\infty} \alpha_i(x) \alpha_k(x) f(x, 0)\,dx + \int_{-\infty}^{\infty} \beta_{ik}(x) f(x, 0)\,dx = 0,$

proved as before. If we set

$$\sigma_{ik} = \int_{-\infty}^{\infty} \alpha_i(x) \alpha_k(x) f(x, 0)\,dx,$$

the theorem states that the joint distribution of $p_n^{(1)}, \dots, p_n^{(r)}$, the $n$th approximation of maximum likelihood, approaches normality, where the matrix of the variances and covariances of the $p_n^{(i)}$ becomes $1/n$ times the inverse matrix of $\|\sigma_{ik}\|$, which we assume non-singular. The proof will be sketched briefly. The theorem is stated in a way invariant under non-singular linear transformations of $p^{(1)}, \dots, p^{(r)}$. We can assume that a linear transformation has been performed already, if necessary, reducing the positive definite quadratic form

(51) $\sum_{i,j=1}^{r} \sigma_{ij}\, p^{(i)} p^{(j)} = \int_{-\infty}^{\infty} \left[\sum_{i=1}^{r} p^{(i)} \alpha_i(x)\right]^2 f(x, 0)\,dx$

to canonical form, so that

(52) $-\int_{-\infty}^{\infty} \beta_{ij}(x) f(x, 0)\,dx = \delta_{ij},$

where $\delta_{ij}$ is the usual Kronecker delta. Equation (43) becomes
ap 
and (47) becomes 


= + > Bus + pr) = 0, 


pntk) i=l j=l 


(43’) 


te Dae) + R, 
It is shown as before that $R_n$ approaches 0 in probability as $n$ becomes infinite. For large $n$, the estimates $p_n^{(i)}$ are then distributed nearly normally, with variances and covariances obtained from $1/n$ times the inverse of the matrix


(47’) = 


UNIVERSITY, 
New York, N. Y. 




METABELIAN GROUPS OF ORDER $p^{n+m}$ WITH
COMMUTATOR SUBGROUPS OF ORDER $p^m$*


BY 
H. R. BRAHANA 


INTRODUCTION 


We consider a metabelian group $G$ obtained by extending an abelian group $H$ of order $p^n$ and type 1, 1, ... by means of $m$ operators $U_1, \dots, U_m$ of order $p$ from a Sylow subgroup of its group of isomorphisms. Every operator of $U = \{U_1, \dots, U_m\}$ determines a partition of $n$ and the fact that $G$ is metabelian is equivalent to the requirement that no operator of $U$ determine a partition of $n$ in which the greatest term is greater than 2. We require further that $H$ be a maximal invariant abelian subgroup of $G$; this implies that no operator, except identity, in $U$ determines a partition of $n$ with greatest term smaller than 2. Throughout the first four sections we shall require that every operator of $U$, except identity, determine the partition $n = 2 + 2 + 1 + \cdots + 1$. Such groups for $m = 3$ as well as the groups such that every operator of $U$, except identity, determines the partition $n = 2 + 1 + \cdots + 1$ have been classified.‡ In the case $m = 3$ we found that there is but one group satisfying the conditions which we impose here, but the considerations necessary to show it indicated that extremely interesting results were to be found for larger values of $m$.

In §1 we suppose generators of $G$ to satisfy a set of relations of a special type and are able to show that the problem of the classification of the resulting groups is exactly the problem of the classification of polynomials of degree $m$ in a single variable $x$ with coefficients in the modular field, mod $p$, under the group of projective transformations on $x$ with coefficients also in the modular field. This is applied in §2 to the groups for $m = 4$, where some obvious properties of the groups suggest a further analysis of the relation between polynomial and group. In particular it becomes apparent that the polynomial is in most cases independent of the special form of the generating relations used in §1; also there appears a group which belongs in the class but has no set of generators satisfying these special relations. In §3 it is shown that the classification of the groups with central of order $p^{n-2}$ is equivalent to the classification of matrices $M + xN$, where $M$ and $N$ are $m$-rowed square matrices with elements in the modular field, under "rational" projective transformations on $x$ and "rational" elementary transformations on $M$ and $N$ simultaneously. This is extended in §5 to show the equivalence of the theory of groups with centrals of order p*-* with the theory of matrices -- + under the same set of transformations. In §4 the invariant factors of $M + xN$ are used to discover some of the properties of the groups.

* Presented to the Society, April 6, 1934; received by the editors March 10, 1934.

† On isomorphisms of abelian groups of type 1, 1, ..., American Journal of Mathematics, vol. 56 (1934), p. 53.

‡ On metabelian groups, offered to the American Journal of Mathematics.


1. A SPECIAL CASE 


Let the generators of $H$ be $s_1, s_2, \dots, s_n$. Let the generators of $G = \{H, U\}$ satisfy the following relations and no others except such as are consequences of these:

(1)
$U_1^{-1} s_1 U_1 = s_1 s_4, \quad U_2^{-1} s_1 U_2 = s_1 s_5, \quad \cdots, \quad U_{m-1}^{-1} s_1 U_{m-1} = s_1 s_{m+2},$
$U_1^{-1} s_2 U_1 = s_2 s_3, \quad U_2^{-1} s_2 U_2 = s_2 s_4, \quad \cdots, \quad U_{m-1}^{-1} s_2 U_{m-1} = s_2 s_{m+1},$
$U_m^{-1} s_1 U_m = s_1 s_3^{a_1} s_4^{a_2} \cdots s_{m+2}^{a_m},$
$U_m^{-1} s_2 U_m = s_2 s_{m+2}.$

The central of $G$ is obviously of order $p^{n-2}$, being generated by $s_3, s_4, \dots, s_n$, and the commutator subgroup is of order $p^m$, being generated by $s_3, s_4, \dots, s_{m+2}$. If $U$ contained an operator of type I, i.e. one which determines the partition $n = 2 + 1 + \cdots + 1$, then $\{s_1, s_2\}$ would contain an operator $s_1 s_2^x$ permutable with some operator of $U$.* Any operator of $U$ may be written $U' = U_1^{k_1} U_2^{k_2} \cdots U_m^{k_m}$. Transforming by $U'$ we have

$$U'^{-1} s_1 s_2^x\, U' = s_1 s_2^x \cdot s_3^{x k_1 + a_1 k_m}\, s_4^{k_1 + x k_2 + a_2 k_m} \cdots s_{m+2}^{k_{m-1} + (x + a_m) k_m}.$$

The commutator in the above must be identity and we thereby obtain the following system of congruences linear and homogeneous in the $k$'s,


$x k_1 + a_1 k_m \equiv 0,$
$k_1 + x k_2 + a_2 k_m \equiv 0,$
$k_2 + x k_3 + a_3 k_m \equiv 0,$
$\cdots$
$k_{m-2} + x k_{m-1} + a_{m-1} k_m \equiv 0,$
$k_{m-1} + (x + a_m) k_m \equiv 0.$


* Cf. the last reference where the question is considered for m=3. 




The condition that the system have a solution is that the determinant of the matrix of coefficients be zero, which is

(2) $x^m + a_m x^{m-1} - a_{m-1} x^{m-2} + a_{m-2} x^{m-3} - \cdots + (-1)^{m-1} a_1 \equiv 0$, mod $p$.

This condition will not be satisfied by any $x$ if the polynomial in (2) contains no linear factor in the modular field. Since there exist irreducible congruences of any degree it follows that $a_1, a_2, \dots, a_m$ in (1) may be chosen so that $U$ contains no operator of type I.
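The passage from the system of congruences to (2) can be checked by direct computation for small $m$ and $p$. The following is a minimal sketch under the conventions written out above; the function names are hypothetical, and the determinant is taken by plain cofactor expansion, which is adequate only for small matrices.

```python
from itertools import product

def congruence_poly(a, x, p):
    """Left side of (2): x^m + a_m x^(m-1) - a_(m-1) x^(m-2) + ... + (-1)^(m-1) a_1, mod p."""
    m = len(a)
    total = pow(x, m, p)
    for k in range(m):                      # k-th term uses a_(m-k) with sign (-1)^k
        total += (-1) ** k * a[m - 1 - k] * pow(x, m - 1 - k, p)
    return total % p

def system_matrix(a, x):
    """Coefficient matrix of the linear congruences in k_1, ..., k_m."""
    m = len(a)
    rows = [[0] * m for _ in range(m)]
    for j in range(m):
        rows[j][j] += x                     # term x * k_j
        if j > 0:
            rows[j][j - 1] += 1             # term k_(j-1)
        rows[j][m - 1] += a[j]              # term a_j * k_m
    return rows

def det_mod(matrix, p):
    """Determinant by cofactor expansion along the first row, reduced mod p."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0] % p
    total = 0
    for c in range(n):
        minor = [row[:c] + row[c + 1:] for row in matrix[1:]]
        total += (-1) ** c * matrix[0][c] * det_mod(minor, p)
    return total % p

def has_type_I_operator(a, p):
    """True when (2) has a root in the modular field, i.e. a linear factor."""
    return any(congruence_poly(a, x, p) == 0 for x in range(p))

# Choices of (a_1, a_2, a_3) for m = 3, p = 7 that leave U without operators of type I.
root_free_choices = [a for a in product(range(7), repeat=3)
                     if not has_type_I_operator(a, 7)]
```

For instance the choice $a_1 = 1$, $a_2 = 6$, $a_3 = 0$ with $p = 7$ gives $x^3 + x + 1$, which has no root mod 7.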

A different choice of generators $s_1, s_2, \dots, s_n$ and $U_1, U_2, \dots, U_m$ in the group $G$ would be expected to result in a different congruence (2), it being understood that the new generators satisfy relations similar to (1) in that the commutator of $U_i$ and $s_2$ is the same as that of $U_{i-1}$ and $s_1$, $i = 2, 3, \dots, m$. We undertake to show first that if $s_1$ and $s_2$ are left unchanged and $U_1, U_2, \dots, U_m$ are changed to any set which satisfy a set of relations similar to (1) then the congruence (2) remains unchanged.

It is evident that, since $U$ is of order $p^m$ and the commutator subgroups arising from transformation of $s_1$ and $s_2$ by $U$ are both of order $p^m$, the choice of $U_1$ determines all the $U$'s thereafter. We shall prove our statement by proving that (2) is unchanged

(a) by replacing $U_1$ by $U_1^k$,

(b) by replacing $U_1$ by $U_2$,

(c) by replacing $U_1$ by the product of $U_i$ and $U_j$ each of which gives the original congruence (2) when used for $U_1$.

If we replace $U_1$ by $U_1' = U_1^k$ the operators $U_2', U_3', \dots, U_m'$ may be determined successively and it is obvious that they are $U_i' = U_i^k$. Consequently the commutator of $U_m'$ and $s_1$ is


This operator expressed in terms of the preceding commutators is 


102 ‘Om 
Sg Sq ** * 


This proves the statement for the case (a). 
If we replace $U_1$ by $U_1' = U_2$, we have $U_i' = U_{i+1}$, $i = 1, 2, \dots, m-1$. The
commutator of $U_{m-1}' = U_m$ and $s_1$ is


(3) Sues 


and therefore we must have U,’ =U,“U.™ - - - U,,°™. Then the commutator 
of and s; is 


a,k agk Oy, k 
S3 Sq * Sm+2- 




This may be expressed in terms of the preceding commutators Si-1=Si, 
i<m+2 and s‘,42 as given by (3). Or, if we evaluate 


in terms of 53, 54, - - - , Sm42, we have (4). This proves our statement for (b). 

From the above facts it follows that (2) remains unchanged if $U_1$ is replaced by $U_1' = U_i^k$. Since the commutators are all permutable among themselves it follows that if $U_i$ and $U_j$ are such that each leaves (2) unchanged when used for $U_1$ in (1), then the product $U_1' = U_i U_j$ used for $U_1$ will determine a set $U_1', U_2', \dots, U_m'$ and a set of commutators $s_3', s_4', \dots, s_{m+2}'$ such that the commutator of $U_m'$ and $s_1$ will be expressible as

$$s_3'^{\,a_1}\, s_4'^{\,a_2} \cdots s_{m+2}'^{\,a_m}.$$

This last relation holds regardless of whether or not the operators $U_1', U_2', \dots, U_m'$ are independent, though we are interested only in the case where they are independent, as otherwise the $U'$'s would not serve with $H$ to generate $G$. This restriction is contained in (c). As a result of these considerations we have


(5) The congruence (2) determined by a set of generators $s_1, s_2, \dots, s_n$, $U_1, U_2, \dots, U_m$ of $G$ is independent of the choice of the $U$'s provided they are chosen from $\{U_1, U_2, \dots, U_m\}$ and the commutators satisfy the relation

We consider next the effect of a new choice of generators of $H$. An essential of the relations (1) is that but two of the generators of $H$ are outside the central of $G$, and since the congruence (2) depends on the form (1) it is obvious that a change in generators of $H$ must be a change to a set of which but two are outside the central of $G$. If $s_1$ and $s_2$ are left fixed and $s_3', s_4', \dots, s_n'$ are chosen from $\{s_3, s_4, \dots, s_n\}$ the change amounts to a renaming of operators in the central and the commutator subgroup and does not affect the relations connecting operators. It is on these relations that (2) depends. Moreover, a choice

$$s_1' = s_1 \prod_{i=3}^{n} s_i^{k_i} \quad \text{and} \quad s_2' = s_2 \prod_{i=3}^{n} s_i^{l_i}$$

has no effect on (2), since the commutator of $U_j$ and $s_i'$ is the same as that of $U_j$ and $s_i$. Consequently we need consider only the effect of a choice of $s_1'$ and $s_2'$ from the group $\{s_1, s_2\}$. We proceed to prove the following theorem:


(6) If $s_1' = s_1^a s_2^b$ and $s_2' = s_1^c s_2^d$ where $s_1, \dots, U_m$ satisfy (1) and determine the congruence (2), then $s_1', s_2', s_3, \dots, U_m$ satisfy a set of relations similar to (1) and determine a congruence which is obtained from (2) by subjecting $x$ to the transformation $x = (ax' + b)/(cx' + d)$.


To prove the theorem we need consider only the special cases
(a) $s_1' = s_1^a$, $s_2' = s_2$, and $x = ax'$,

(b) $s_1' = s_1 s_2$, $s_2' = s_2$, and $x = x' + 1$,

(c) $s_1' = s_2$, $s_2' = s_1$, and $x = 1/x'$.


It is not necessary to record the details of the computation here, for the opera- 
tions are all rational. The two transformations, one on the generators of G 
and the other on the variable x, determine in each case the same transforma- 
tion on the congruence (2). Any transformation on the generators of {s1, s2} 
is a product of transformations of the above types; corresponding to it will 
be a product of transformations of the three types above on x. The matrices 
of the two products will be identical. 

It results from the above considerations that the problem of the classifica- 
tion of groups whose generators satisfy (1) is exactly the problem of the classi- 
fication of congruences (2), which have no roots in the modular field, under 
the group of projective transformations on the variable with coefficients in 
the modular field. 
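The action of a projective transformation on a congruence may be made concrete as follows. This is a sketch only: polynomials are stored as coefficient lists from degree 0 upward, all names are hypothetical, and the substitution $x = (ax' + b)/(cx' + d)$ is cleared of denominators by multiplying through by $(cx' + d)^m$.

```python
def poly_eval(coeffs, x, p):
    """Evaluate sum(coeffs[i] * x**i) mod p; coefficients are listed from degree 0 up."""
    return sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p

def poly_mul(f, g, p):
    """Product of two polynomials mod p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] = (out[i + j] + fi * gj) % p
    return out

def poly_pow(f, e, p):
    """e-th power of a polynomial mod p by repeated multiplication."""
    out = [1]
    for _ in range(e):
        out = poly_mul(out, f, p)
    return out

def substitute(coeffs, a, b, c, d, p):
    """Coefficients of (c x' + d)^m * f((a x' + b) / (c x' + d)) mod p."""
    assert (a * d - b * c) % p != 0, "the transformation must be non-singular"
    m = len(coeffs) - 1
    result = [0] * (m + 1)
    for i, ci in enumerate(coeffs):
        # Term c_i * (a x' + b)^i * (c x' + d)^(m - i), always of degree m.
        term = poly_mul(poly_pow([b, a], i, p), poly_pow([d, c], m - i, p), p)
        for k, tk in enumerate(term):
            result[k] = (result[k] + ci * tk) % p
    return result

f = [1, 1, 0, 1]                       # x^3 + x + 1, which has no root mod 7
g = substitute(f, 2, 3, 1, 4, 7)       # x = (2x' + 3)/(x' + 4), determinant 5 mod 7
```

Since $f$ has no root in the modular field, the transformed polynomial keeps its degree and likewise has no root, which is the property on which the classification rests.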


2. THE GROUPS G FOR m=4 


The indicated classification of the congruences (2) has been carried out 
for m=3 and m=4.* These two cases present striking differences and the 
latter points the way to the results to be expected for a general m. 

When $m = 3$ and the left-hand side of (2), which we shall denote hereafter by $f(x)$, contains no linear factor in the modular field, then $f(x)$ is irreducible.
This is not always true when m=4. All irreducible cubics are conjugate under 
the linear homogeneous group with coefficients in the modular field, and con- 
sequently any two groups G with a given H and m=3 are simply isomorphic. 
It is obvious that not all quartics with no linear factor are conjugate under 
that group, and further that not all irreducible quartics are conjugate under 


* On cubic congruences, Bulletin of the American Mathematical Society, vol. 39 (1933), pp. 962- 
969; and Irreducible quartic congruences, also offered to the same Bulletin. 




it, for a necessary condition for conjugacy is that their absolute invariants 
be the same. The identity of the absolute invariants is sufficient for conjugacy 
under the general projective group but not for conjugacy under its subgroup 
whose coefficients are in the modular field. If a given quartic is irreducible 
and a second quartic is conjugate to it under the general projective group but 
not under its “rational” subgroup, the second quartic is the product of two 
quadratic factors, one of which is irreducible. 

When $m = 4$ there are $p + 1$ distinct groups $G$ whose generators satisfy (1).
They correspond to $p + 1$ quartics none of which has a linear factor. We are
not interested here in making the count of the polynomials, but rather in 
comparing groups corresponding to different types of polynomial and inter- 
preting properties of the polynomials in terms of properties of the groups. 

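Let us denote the polynomial by $f(x)$. Let us consider an $f(x)$ of degree 4 which is the product of two irreducible quadratics. There exists a group $G$ whose generators satisfy (1) where the $a$'s are the coefficients of $f(x)$. For the sake of simplicity let us suppose that

$$f(x) = (x^2 - \lambda_1)(x^2 - \lambda_2)$$

where $\lambda_1$ and $\lambda_2$ are not squares. There is a group $G'$ determined by $H$ and four $U$'s which satisfy the relations

(7)
$U_1^{-1} s_1 U_1 = s_1 s_4, \quad U_2^{-1} s_1 U_2 = s_1 s_3^{\lambda_1}, \quad U_3^{-1} s_1 U_3 = s_1 s_6, \quad U_4^{-1} s_1 U_4 = s_1 s_5^{\lambda_2},$
$U_1^{-1} s_2 U_1 = s_2 s_3, \quad U_2^{-1} s_2 U_2 = s_2 s_4, \quad U_3^{-1} s_2 U_3 = s_2 s_5, \quad U_4^{-1} s_2 U_4 = s_2 s_6.$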

The generators which were selected for $G'$ do not satisfy (1), nevertheless the condition that $\{U_1, \dots, U_4\}$ contain no operator of type I is readily seen to be that $f(x)$ have no linear factor. Moreover, it can be shown that if $\lambda_1$ and $\lambda_2$ are distinct, generators of $G'$ can be chosen which do satisfy (1). Such a set is obtained by taking $U_1' = U_1 U_3$, in which case the resulting polynomial is $f(x)$. Therefore the two groups $G$ and $G'$ are simply isomorphic. From relations (7) it is easy to see that $G'$, and consequently $G$, contains two subgroups $\{H, U_1, U_2\}$ and $\{H, U_3, U_4\}$ of order $p^{n+2}$ each with commutator subgroup of order $p^2$. Looking at generators of $G$ it is obvious that $G$, and consequently $G'$, contains subgroups of order $p^{n+2}$ with commutator subgroups of order $p^3$. These facts are dependent on the condition that $f(x)$ is the product of two distinct irreducible quadratics. If this condition is not satisfied there are two possibilities: (a) $f(x)$ is irreducible and then $G$ contains no subgroup of order $p^{n+2}$ with commutator subgroup of order $p^2$; and (b) $f(x)$ is the square of an irreducible quadratic, and $G$ does not contain two subgroups of order $p^{n+2}$ with commutator subgroups of order $p^2$ and with commutator subgroups distinct except for the identity. This last fact may be seen readily by considering a set of generators of $G'$ which satisfy (7) where $\lambda_1 = \lambda_2$. Such a group exists and determines the polynomial $f(x) = (x^2 - \lambda_1)^2$. Every subgroup $\{H, U', U''\}$ has a commutator subgroup of order $p^2$ or of order $p^4$. $G'$ cannot then be simply isomorphic with $G$ whose generators satisfy (1) and whose polynomial is $f(x)$.
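The three cases distinguished above (irreducible, product of two distinct irreducible quadratics, square of an irreducible quadratic) can be told apart mechanically for a given quartic. The sketch below proceeds by brute force over the modular field and is intended only for small $p$; the names and the coefficient ordering (degree 0 first) are choices made here.

```python
from collections import Counter
from itertools import product

def poly_eval(coeffs, x, p):
    """Evaluate a polynomial (coefficients from degree 0 up) at x, mod p."""
    return sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p

def poly_mul(f, g, p):
    """Product of two polynomials mod p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] = (out[i + j] + fi * gj) % p
    return out

def irreducible_quadratics(p):
    """Monic quadratics x^2 + b x + c with no root mod p."""
    return [[c, b, 1] for b in range(p) for c in range(p)
            if all(poly_eval([c, b, 1], x, p) != 0 for x in range(p))]

def classify_quartic(f, p):
    """Factorization type of a monic quartic f mod p."""
    if any(poly_eval(f, x, p) == 0 for x in range(p)):
        return "has a linear factor"
    quads = irreducible_quadratics(p)
    target = [c % p for c in f]
    for i, q1 in enumerate(quads):
        for q2 in quads[i:]:
            if poly_mul(q1, q2, p) == target:
                return "square of a quadratic" if q1 == q2 else "two distinct quadratics"
    return "irreducible"   # no roots and no quadratic factor

# With p = 7 the non-squares are 3, 5, 6, so (x^2 - 3)(x^2 - 5) = x^4 + 6x^2 + 1 mod 7.
example_type = classify_quartic([1, 0, 6, 0, 1], 7)

# Tally of all monic quartics mod 7 by factorization type.
counts = Counter(classify_quartic(list(low) + [1], 7)
                 for low in product(range(7), repeat=4))
```

The tally counts individual polynomials, not conjugate sets under the projective group, so it is not the count of groups discussed in the text.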

Granting that there are $p + 1$ conjugate sets of quartics which have no linear factors, we are able to distinguish $p + 2$ types of group $G$ for $m = 4$ which satisfy the conditions of the introduction. Of those there are $p + 1$ which come under the special case of §1. We are able to separate these $p + 2$ groups into three types by a consideration of their subgroups. Those with no subgroups of order $p^{n+2}$ with commutator subgroup of order $p^2$ correspond to irreducible quartics. Those with such subgroups but also with subgroups of order $p^{n+2}$ with commutator subgroup of order $p^3$ correspond to reducible quartics. The one with no subgroup of order $p^{n+2}$ with commutator subgroup of order $p^3$ corresponds (not in the sense of §1) to the square of a quadratic.

An interesting question is that of the existence of some subgroup or set 
of subgroups by means of which we may distinguish among the (p+1)/2 
groups whose quartics are irreducible and the (p—1)/2 groups whose quartics 
are products of two distinct quadratics. The question looks sufficiently inter- 
esting to warrant our posing it in detail for the simple case where p=7. The 
four conjugate sets of irreducible quartics are represented by 


(a) + 54 +2 =0, 
(b) x*+ 627+ 42+2=0, 
(c) + 2x+3=0, 
(d) +4*2+4=0. 
Groups of order $7^{n+4}$ corresponding to these quartics are generated by operators satisfying (1) where
= Sisk, 
Uz's2U4 = S28, 
and s, takes the respective forms 
(a) Se = (b) Se = (Cc) Se = (d) Se = 
All of these groups are identical with respect to the following:
the order is $7^{n+4}$;
they are metabelian;
the central is of order $7^{n-2}$;
the commutator subgroup is of order $7^4$;
$G/H$ is abelian, of order $7^4$ and type 1, 1, 1, 1;
every subgroup of order $7^{n+1}$ has a commutator subgroup of order $7^2$;
no subgroup of order $7^{n+2}$ has a commutator subgroup of order $7^2$;
every subgroup of order $7^{n+3}$ has a commutator subgroup of order $7^4$.

It would seem that there is a possibility of difference in the numbers of subgroups of order $7^{n+2}$ with commutator subgroups of orders $7^3$ and $7^4$. However it seems extremely unlikely that such differences could correspond to the distinction implied by the differences in value of the absolute invariant of the corresponding quartics, especially since the absolute invariant does not distinguish between an irreducible and a reducible quartic.

While it is true that two groups which differ in the number of subgroups 
having a given property cannot be simply isomorphic, it is not to be assumed 
that the converse is true. There is no compelling reason to expect the non- 
isomorphism of two groups to be reflected in properties of their subgroups. 
Nevertheless, the author is not aware of any prior example where the non- 
isomorphism of two groups is not easily deducible from a consideration of 
their subgroups. The number of occasions where a number-theoretic argu- 
ment is indispensable in the theory of groups is small, although the occasions 
themselves are crucial as is evidenced by the amount of the theory that de- 
pends on the existence of primitive roots in a Galois field. 


3. TWO GENERAL THEOREMS


The statements of the last section for the case m=4 can all be established 
easily for that special case. The relation between properties of the polynomial 
f(x) and properties of the group and the obvious direction in which a generali- 
zation of the results should proceed suggest a closer scrutiny of f(x). This 
polynomial was obtained as the condition that U contain no operator of type 
I but it is obviously more important than that would imply. Also it seemed 
to be connected with a certain selection of the generators of U, but the last 
section has shown that it appears when the generators do not satisfy the con- 
ditions (1). 

In this section we shall generalize the situation in §1 and we shall furnish incidentally the necessary proofs for the statements of §2. We require now only that $G$ be metabelian, that the central and commutator subgroups be of orders $p^{n-2}$ and $p^m$. We may assume in that case that a set of independent generators of $H$ is chosen which contains $m$ independent operators of the commutator subgroup and $n - 2$ independent operators of the central. We also still require the $U$'s to be of order $p$ and permutable. The generalization consists in not requiring the generators of $G$ to satisfy the relations (1). Under these conditions an independent set of relations on generators of $G$ is completely described by a pair of $m$-rowed square matrices. We associate one matrix $M$ with the operator $s_1$ and the other matrix $N$ with $s_2$. We let the $i$th row of each matrix correspond to the generator $U_i$, and we let the $j$th column of each matrix correspond to $s_{j+2}$, where $s_3, s_4, \dots, s_{m+2}$ are independent generators of the commutator subgroup of $G$. The elements in the $i$th row and the $j$th column of $M$ and $N$ are the exponents of $s_{j+2}$ in the commutator of $U_i$ with $s_1$ and $s_2$ respectively. For example, $M$ and $N$ for relations (1) are


$$M = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \cdot & \cdot & \cdot & \cdots & \cdot \\ 0 & 0 & 0 & \cdots & 1 \\ a_1 & a_2 & a_3 & \cdots & a_m \end{pmatrix}, \qquad N = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \cdot & \cdot & \cdot & \cdots & \cdot \\ 0 & 0 & 0 & \cdots & 1 \end{pmatrix}.$$

In the case where the $U$'s can be separated into sets each set satisfying (1), of which relations (7) describe the simplest case, the matrix $M$ takes the form

$$M = \begin{pmatrix} M_1 & & \\ & M_2 & \\ & & \ddots \end{pmatrix},$$

where $M_i$ is of the above form and all other elements are zeros. In this case also $N$ is the identity matrix.

The congruence (2) is immediately recognizable as the condition for the vanishing of the determinant $|M + xN|$. The condition that the $U$'s be all of type II is that $|M + xN|$ have no linear factor in the modular field. The general case requires no further argument in these respects.

A change of generators of the commutator subgroup amounts to replacing $s_i$, $i = 3, 4, \dots, m+2$, by

$$s_i' = s_3^{c_{i3}}\, s_4^{c_{i4}} \cdots s_{m+2}^{c_{i,m+2}},$$

where $C$, the matrix of the $c_{ij}$'s, is non-singular. The effect on $M$ and $N$ is to replace them by $MC^{-1}$ and $NC^{-1}$ respectively. Likewise a change in generators of $U$ is equivalent to replacing $U_i$ by

$$U_i' = U_1^{d_{i1}}\, U_2^{d_{i2}} \cdots U_m^{d_{im}},$$

where $D$ is also non-singular. The effect of this is to replace $M$ and $N$ by $DM$ and $DN$ respectively. The theory of the groups described in the introduction is therefore equivalent to the theory of pairs of matrices with elements in a modular field.* Applying the theory of pairs of matrices we have the following theorem:


(8) Two groups satisfying the conditions of the introduction and determined 
by M, N and M’, N’ are simply isomorphic if and only if M+xN and M'+xN’ 
have invariant factors which are conjugate under some operator of the projective 
group of transformations on x with coefficients in the modular field. 
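That a change of generators leaves the roots of $|M + xN|$ untouched can be verified on a small case. In the sketch below, which is illustrative only, the matrices $M$ and $N$ are those belonging to relations (1) with $N$ the identity, the matrix `E` stands in for $C^{-1}$ (any non-singular matrix will do), and all names are hypothetical.

```python
def mat_mul(A, B, p):
    """Matrix product mod p."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) % p
             for j in range(len(B[0]))] for i in range(len(A))]

def det_mod(matrix, p):
    """Determinant by cofactor expansion along the first row, reduced mod p."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0] % p
    total = 0
    for c in range(n):
        minor = [row[:c] + row[c + 1:] for row in matrix[1:]]
        total += (-1) ** c * matrix[0][c] * det_mod(minor, p)
    return total % p

def relations_1_matrices(a):
    """M and N for relations (1): ones above the diagonal of M, the a's in its last row."""
    m = len(a)
    M = [[1 if j == i + 1 else 0 for j in range(m)] for i in range(m - 1)] + [list(a)]
    N = [[1 if i == j else 0 for j in range(m)] for i in range(m)]
    return M, N

def roots_of_pencil(M, N, p):
    """Values of x in the modular field at which |M + xN| vanishes."""
    size = len(M)
    return [x for x in range(p)
            if det_mod([[(M[i][j] + x * N[i][j]) % p for j in range(size)]
                        for i in range(size)], p) == 0]

p = 7
M, N = relations_1_matrices([6, 0, 0])          # |M + xN| = x^3 + 6 mod 7
D = [[1, 2, 0], [0, 1, 0], [3, 0, 1]]           # non-singular (determinant 1)
E = [[2, 0, 1], [0, 1, 0], [1, 0, 1]]           # plays the role of C^{-1} (determinant 1)

roots_before = roots_of_pencil(M, N, p)
roots_after = roots_of_pencil(mat_mul(mat_mul(D, M, p), E, p),
                              mat_mul(mat_mul(D, N, p), E, p), p)
```

The roots agree because $|D(M + xN)C^{-1}|$ differs from $|M + xN|$ only by the non-zero factor $|D|\,|C|^{-1}$. The invariant factors of theorem (8) carry more information than the roots, so this checks a necessary condition only.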


We proceed to our second theorem. We consider a group $G$ whose generators satisfy (1) and determine the congruence $f(x) = 0$. We suppose that $G$ contains a subgroup $G' = \{H, V_1, \dots, V_{m'}\}$ with commutator subgroup of order $p^{m'}$. Let the congruence determined by $G'$ be $f'(x) = 0$. We wish to prove that $f'(x)$ is a factor of $f(x)$. Let $x_1$ be a root of $f'(x) = 0$.† Then by means of a set of linear homogeneous congruences similar to that preceding (2) $x_1$ determines a set of numbers $l_1, l_2, \dots, l_{m'}$ such that $V_1^{l_1} V_2^{l_2} \cdots V_{m'}^{l_{m'}}$ is permutable with $s_1 s_2^{x_1}$. The $V$'s are expressible in terms of the $U$'s and consequently the $l$'s determine a set of numbers $k_1, k_2, \dots, k_m$ such that $U_1^{k_1} U_2^{k_2} \cdots U_m^{k_m}$ is permutable with $s_1 s_2^{x_1}$. Consequently, $x_1$ is a root of $f(x) = 0$. Hence,


(9) If $G$ determines the congruence $f(x) = 0$, if $G$ contains a subgroup $G'$ of order $p^{n+m'}$ with commutator subgroup of order $p^{m'}$, and if $G'$ determines the congruence $f'(x) = 0$ where the $s_1$ and $s_2$ used to determine $f'(x)$ are those used to determine $f(x)$, then $f'(x)$ is a factor of $f(x)$.


These two theorems establish and generalize all the unsubstantiated state- 
ments of §2. The first determines a canonical form for the generating relations 
of G and the second interprets the irreducible factors of f(x) in terms of sub- 
groups of G. 


4. CLASSIFICATION OF THE GROUPS G AND DISCUSSION OF PROPERTIES 


It has been shown elsewheret that m cannot be greater than 2” —4, if U 
contains an operator of type II and no operator except those of types I and II. Since here we require every operator of $U$ to be of type II it is obvious that $m$ is limited by the order of the commutator subgroup. We must have $m \le n - 2$. To find all the groups for a given $m$ we may first determine all the conjugate sets of polynomials $f(x)$ of degree $m$ under the "rational" linear fractional group. There exists at least one group for each conjugate set. If $f(x)$ is irreducible, or if the irreducible factors of $f(x)$ are relatively prime, then there exists but one group for the conjugate set to which $f(x)$ belongs, for the invariant factors of $M + xN$ are $f(x), 1, 1, \dots$. Let us suppose that $f(x)$ has one irreducible factor $f_1(x)$ which is repeated, so that $f(x) = [f_1(x)]^r f_2(x)$ where the factors of $f_2(x)$ are relatively prime and prime to $f_1(x)$. Then the invariant factors of $M + xN$, having the property that each is divisible by all those which follow it and being such that their product is $f(x)$, may be selected in as many ways as there are partitions of $r$. The number of such groups is therefore $\theta(r)$. In general,


* For the theory of pairs of matrices, cf. Dickson, Modern Algebraic Theories, 1930, p. 112.

† $x_1$ is not in the modular field. We beg to be excused from interpreting $s_1 s_2^{x_1}$ and $V_1^{l_1} V_2^{l_2} \cdots V_{m'}^{l_{m'}}$. This, however, does not affect the argument.

‡ On metabelian groups, loc. cit.


(10) If $f(x)$ has the distinct irreducible factors $f_1(x), \dots, f_k(x)$ and they
appear to the powers $r_1, \dots, r_k$, then the number of distinct groups $G$ determined
by $f(x)$ is equal to the product $\prod_{i=1}^{k}\theta(r_i)$ of the numbers of partitions of the $r_i$'s.
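
The count in (10) can be tried out numerically. The following is only a sketch: the function names `partitions` and `groups_for_exponents` are hypothetical, and the input is assumed to be the list of exponents to which the distinct irreducible factors of f(x) appear.

```python
from functools import lru_cache
from math import prod


@lru_cache(maxsize=None)
def partitions(n, largest=None):
    """Number of partitions of n into parts not exceeding `largest`."""
    if largest is None:
        largest = n
    if n == 0:
        return 1
    # Choose the largest part first, then partition the remainder
    # into parts that are no larger than it.
    return sum(partitions(n - part, part)
               for part in range(1, min(n, largest) + 1))


def groups_for_exponents(exponents):
    """Product of the partition counts of the exponents, as in statement (10)."""
    return prod(partitions(r) for r in exponents)
```

For instance, `groups_for_exponents([3])` gives 3 and `groups_for_exponents([2, 1])` gives 2, which agrees with the three groups and the two groups found for repeated quadratic factors in Case (4) of the discussion of m=6.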


Let us consider a group $G$ and its corresponding polynomial $f(x)$ where
the irreducible factors of $f(x)$ are relatively prime. The invariant factors of
$M+xN$ are $f(x), 1, 1, \dots$. Generators of $U$ and of the commutator subgroup
may be chosen so that the transformed determinant $M'+xN'$ is in
canonical form. If this is done the generators of $G$ satisfy relations (1) where
the $a$'s are the coefficients of $f(x)$. Looking at the present case from another
point of view let us suppose the irreducible factors of $f(x)$ to be $f_1(x), \dots, f_k(x)$
of degrees $m_1, \dots, m_k$. Let us consider $k$ sets of $U$'s, $U_{i1}, \dots, U_{im_i}$, each set
satisfying relations (1), the resulting commutator subgroups being distinct,
all the $U$'s being permutable, and the $a$'s in (1) for the $i$th set being the
coefficients of $f_i(x)$. The group generated by $H, U_{11}, \dots, U_{km_k}$ obviously
determines the congruence $f(x)=0$, and since its irreducible factors are relatively
prime, the invariant factors of $M+xN$ are necessarily $f(x), 1, 1, \dots$. This
group is therefore the same as the group described at the beginning of the
paragraph. Consequently, when the irreducible factors of $f(x)$ are relatively
prime, $G$ contains a set of $k$ subgroups of orders $p^{n+m_i}$, $i=1, \dots, k$, with
commutator subgroups of orders $p^{m_i}$, and $G$ contains no other subgroup of order
$p^{n+\alpha}$ with commutator subgroup of order $p^{\alpha}$ except such as are obtained by
combining these. In fact these subgroups are characteristic.

The essential condition on $G$ in order that it be possible to write the two
sets of generating relations made use of above is that the invariant factors
be all unity except one. This may still hold if some or all of the irreducible
factors are repeated. In that case, however, there will exist subgroups of order
$p^{n+\alpha}$ with commutator subgroups of order $p^{\alpha}$ where $\alpha$ is not one of the $m_i$'s,
or a combination of them. Let us suppose that $f(x)=[f_1(x)]^r$, where $f_1(x)$ is
irreducible and of degree $m_1$, and let us suppose further that the invariant
factors of $M+xN$ are $f(x), 1, 1, \dots$. Now let us consider two sets of $(r-1)m_1$
and $m_1$ $U$'s respectively. Let the first set satisfy relations (1) where the $a$'s
are the coefficients of $[f_1(x)]^{r-1}$. Let the second set determine with $H$ a group
whose commutator subgroup is of order $p^{m_1}$, and let the commutator subgroup
determined by the two sets be of order $p^{m}$. This can obviously be done
by selecting the commutator of $U_m$ and $s_1$ from the group generated by the
$rm_1$ commutators which precede it and not in either of the groups generated
by the first $(r-1)m_1$ or the last $m_1$ commutators. Clearly this commutator of
$U_m$ and $s_1$ can be chosen so that the invariant factors of $M+xN$ are $f(x), 1,
1, \dots$, since it is simply a question of requiring the canonical form of
$M+xN$ to have certain coefficients and there is so much freedom in the choice
of the commutator. From this it follows that the group $G$ contains at least
one subgroup of order $p^{n+\alpha}$ with commutator subgroup of order $p^{\alpha}$ where
$\alpha=km_1$ and $k$ is any number from 1 to $r$.

If in the above case we selected the commutator of $U_m$ and $s_1$ in the group
generated by the preceding $m_1$ commutators and selected it so that the
congruence determined by $H$ and the set of $m_1$ $U$'s was $f_1(x)$, we should have the
canonical form for $M+xN$ with invariant factors $[f_1(x)]^{r-1}, f_1(x), 1, 1, \dots$.
In this case G contains two groups of order p*+™ with commutator subgroups 
of order p™. The two sets of U’s which determine these groups give with H 
a group of order p"+?™ with commutator subgroup of order p?™ none of 
whose subgroups of order p*+™ has a commutator subgroup of order p™*?. 

The effects on G of an increase in the number of repeated factors of f(x) 
or of the invariant factors of $M+xN$ different from unity can be determined
by considerations similar to the above. Rather than pursue this further we 
shall give a brief description of the groups G for m=6. We omit m=5 because 
in that case f(x) could have no repeated factors. 

When m=6, f(x) may be (1) irreducible, (2) the product of an irreducible 
quartic and a quadratic, (3) the product of two irreducible cubics, or (4) the 
product of three quadratics. 

Case (1). There are as many groups as there are conjugate sets of irreducible
sextics under the “rational” projective group, a number as yet undetermined.
None of these groups has a subgroup of order $p^{n+\alpha}$ with commutator
subgroup of order $p^{\alpha}$, $\alpha<6$.

Case (2). The quartic may be transformed into one of $(p+1)/2$ depending
on the value of the absolute invariant. The operator of order $2^i$, $i=1, 2$, which
transforms the quartic into itself transforms its roots into their $p^{2/i}$th powers
and consequently transforms every element in the Galois field determined by
it into its $p^{2/i}$th power. The roots of the quadratic are in that $GF(p^4)$ and the
quadratic is also transformed into itself. Therefore for each of the $(p+1)/2$
quartics there are as many groups as there are irreducible quadratics belonging
to the modular field. This number is $(p^2-p)/2$. The number of groups
is $p(p^2-1)/4$. Each of the groups has generators which satisfy relations (1).
Each has subgroups of orders $p^{n+2}$ and $p^{n+4}$ with commutator subgroups of
orders $p^2$ and $p^4$ respectively, and no other subgroups of order $p^{n+\alpha}$ with
commutator subgroup of order $p^{\alpha}$.
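
The arithmetic of Case (2) can be checked by brute force for small primes. This is a sketch under stated assumptions: the helper names are hypothetical, p is assumed to be an odd prime, and the factor (p+1)/2 for the quartics is taken from the text rather than verified.

```python
def irreducible_monic_quadratics(p):
    """Count x^2 + b*x + c over GF(p), p prime, that have no root in GF(p)."""
    count = 0
    for b in range(p):
        for c in range(p):
            # A quadratic is irreducible over GF(p) exactly when it has no root there.
            if all((x * x + b * x + c) % p != 0 for x in range(p)):
                count += 1
    return count


def case2_group_count(p):
    """(p+1)/2 quartics times the number of irreducible quadratics (p odd)."""
    return (p + 1) // 2 * irreducible_monic_quadratics(p)
```

For p = 3, 5, 7 the first function returns 3, 10, 21, that is p(p-1)/2, and the second returns 6, 30, 84, that is p(p^2-1)/4.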

Case (3). One of the cubics can be transformed into a given irreducible
cubic and no further specialization may be made. The other cubic may then
be any one of $p(p^2-1)/3$, one of which is the first one. There are therefore
$(p^3-p+3)/3$ groups of this kind. All but one have generators which satisfy
relations (1). The odd group contains no subgroups of order $p^{n+\alpha}$ with
commutator subgroup of order $p^{\alpha}$. One other contains one subgroup of order $p^{n+3}$
with commutator subgroup of order $p^3$, and all the others contain two.
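
The count in Case (3) can be reproduced in the same way. The sketch below uses hypothetical helper names and assumes p is prime. It also assumes one reading of the text: each second cubic distinct from the first gives one group, while a coincident second cubic gives two groups, one for each partition of 2.

```python
def irreducible_monic_cubics(p):
    """Count x^3 + a*x^2 + b*x + c over GF(p), p prime, with no root in GF(p)."""
    count = 0
    for a in range(p):
        for b in range(p):
            for c in range(p):
                # A cubic is irreducible over GF(p) exactly when it has no root there.
                if all((x ** 3 + a * x * x + b * x + c) % p != 0 for x in range(p)):
                    count += 1
    return count


def case3_group_count(p):
    """One group per distinct second cubic, two groups when the cubics coincide."""
    cubics = irreducible_monic_cubics(p)
    return (cubics - 1) + 2
```

For p = 3 and p = 5 this gives 8 and 40 irreducible cubics, that is p(p^2-1)/3, and 9 and 41 groups, that is (p^3-p+3)/3.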

Case (4). One of the quadratics can be transformed into $(x^2-\lambda)$ and a
second into one of $(p+1)/2$ quadratics which are taken one from each of the
conjugate sets of quadratics under the group which leaves $(x^2-\lambda)$ fixed. Having
decided which of the three quadratics are first and second and having
transformed them to the desired form, no other simplification is possible.
Hence, the third quadratic may be any one of $p(p-1)/2$. If the three quadratics
are the same, then we may suppose $f(x)$ to be $(x^2-\lambda)^3$ and there are
three groups: one, corresponding to the invariant factors $f(x), 1, 1, \dots$,
which contains subgroups of orders $p^{n+2}$ and $p^{n+4}$ with commutator subgroups
of orders $p^2$ and $p^4$ respectively; one, corresponding to invariant factors
$(x^2-\lambda)^2, (x^2-\lambda), 1, 1, \dots$, which contains subgroups of the first type but
none of the second; and one, corresponding to invariant factors $(x^2-\lambda),
(x^2-\lambda), (x^2-\lambda), 1, 1, \dots$, which contains no subgroups of either type. If
two of the quadratics are the same and the third is distinct from this, we may
transform the repeated one into $(x^2-\lambda)$ and then the third may be any one
of $(p-1)/2$. For each of these there are two groups, according as one or two
of the invariant factors of $M+xN$ are different from one. The two groups
corresponding to the same $f(x)$ are again distinguished by their subgroups of
order $p^{n+4}$. There are $p-1$ of these groups. If the three quadratics are distinct,
$f(x)$ can be reduced to one of $(p-1)(p^2-p-4)/4$ forms, but the reduction
can be made in more than one way since it involved the selection of a
first and a second quadratic. A different selection of the first and second
quadratics may or may not change $f(x)$, depending on the relations of the
three quadratics. We shall not make the count of the number of groups, but
shall note that for each such $f(x)$ there is but one group, and that each such
group contains three and only three subgroups of order $p^{n+2}$ with commutator
subgroup of order $p^2$ and that it contains subgroups of orders $p^{n+2}$ and $p^{n+4}$
with commutator subgroups of orders $p^2$ and $p^4$ respectively.




5. A GENERALIZATION 


There are two obvious directions in which the results so far obtained may
be generalized. The theory of pairs of matrices as expounded by Dickson
(cf. the reference above) does not require the matrices to be non-singular,
whereas the condition that $U$ contain no operator of type I does require that
$M$ and $N$ be non-singular. If $U$ contains an operator of type I we have seen
that $M+xN$ contains a linear factor. Obviously, there exists in that case a
transformation on $x$ which transforms one of the roots of $f(x)=0$ to zero, and
such a transformation replaces $M$ and $N$ by $M'$ and $N'$ where $M'$ is singular.
The new generator $s_1'$ is therefore permutable with one of the operators of $U$.
If $f(x)$ has a linear factor which is repeated, then after the above transformation
more than one of the $U$'s will be permutable with $s_1'$. If $f(x)$ has two
distinct linear factors, then a transformation on $x$ will put one of the roots of
$f(x)=0$ into zero and another into infinity, in which case both $M'$ and $N'$ will
be singular.* We shall not pursue this question at this time but shall consider
another extension.

Let us remove the restriction that the operators of U be of type II, as- 
suming that U contains an operator of type K where the type K is distin- 
guished by the fact that the operator determines the partition 


in which there are $k$ 2's. The central of $G$ will then be of order at most $p^{n-k}$.
On the other hand if the central of $G$ is of order $p^{n-k}$ and $U$ contains no operator
corresponding to a partition of $m$ with a greatest term greater than 2, then
$U$ can contain no operator of type $J$ where $j>k$. We shall require further that
the commutator subgroup of $G$ and that $U$ be of order $p^m$. The considerations
of the first paragraph of this section indicate that groups satisfying these
restrictions are of fundamental importance.

The $U$'s are assumed to be permutable as before. Now generators of $H$
can be selected so that $n-k$ of them are in the central and $m$ of those are in
the commutator subgroup. The relations among generators of $G$ are then
completely described by $k$ matrices $M_1, M_2, \dots, M_k$, one corresponding to
each of the non-invariant generators $s_1, s_2, \dots, s_k$, and defined exactly as
$M$ and $N$ in §3. The condition that all the operators of $U$ be of type $K$ is
obtained by considering the commutator of $s_1^{x_1}s_2^{x_2}\cdots s_k^{x_k}$ and
$U_1^{k_1}U_2^{k_2}\cdots U_m^{k_m}$. This leads to a system of linear homogeneous congruences in
$k_1, k_2, \dots, k_m$, the condition for whose solution is
$|x_1M_1+x_2M_2+\cdots+x_kM_k|=0$, mod $p$. There will be no operator of a lower
type in $U$ if this polynomial $f(x_1, x_2, \dots, x_k)$ contains no linear factor.

* Compare with the classification of groups for m=3, On metabelian groups, loc. cit.

Exactly as before we may restrict our attention to changes in $f(x_1, \dots, x_k)$
due to changes in the generators which satisfy the following conditions: the
$U$'s are selected from the group $U$; $m$ of the independent generators of $H$
are in the commutator subgroup; and $k$ of the independent generators of $H$
are in the group $\{s_1, s_2, \dots, s_k\}$. The first condition makes certain that the
$U$'s are permutable and disregards all transformations that leave the $M$'s
simultaneously invariant; the second is necessary if the $M$'s are to remain
square matrices; and the third is necessary if the number of $M$'s is to remain
unchanged. A selection of new sets of generators of $U$ and the commutator
subgroup subject to these conditions results in the transformation
$M_i'=CM_iD$, where $C$ and $D$ are non-singular. This does not change the
polynomial $f(x_1, \dots, x_k)$, and does not change the invariant factors* of the
matrix $x_1M_1+\cdots+x_kM_k$.
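
The effect of replacing each matrix by its product with non-singular C and D can be illustrated numerically. The sketch below uses hypothetical helper names and works modulo a prime p. It checks that the determinant of the pencil is multiplied by the fixed non-zero constant |C||D|, which is one way to read the statement that the polynomial is not changed.

```python
def det_mod(matrix, p):
    """Determinant of a square integer matrix modulo a prime p (Gaussian elimination)."""
    a = [[entry % p for entry in row] for row in matrix]
    n = len(a)
    det = 1
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if a[r][col]), None)
        if pivot is None:
            return 0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap changes the sign of the determinant
        det = det * a[col][col] % p
        inv = pow(a[col][col], -1, p)  # modular inverse, Python 3.8 or later
        for r in range(col + 1, n):
            factor = a[r][col] * inv % p
            for c in range(col, n):
                a[r][c] = (a[r][c] - factor * a[col][c]) % p
    return det % p


def matmul_mod(x, y, p):
    """Matrix product x*y reduced modulo p."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) % p
             for j in range(len(y[0]))] for i in range(len(x))]


def pencil(matrices, xs, p):
    """The matrix x_1*M_1 + ... + x_k*M_k reduced modulo p."""
    n = len(matrices[0])
    return [[sum(x * m[i][j] for x, m in zip(xs, matrices)) % p
             for j in range(n)] for i in range(n)]
```

To use it, form `matmul_mod(matmul_mod(C, M, p), D, p)` for each matrix M of the set and compare `det_mod(pencil(...), p)` before and after at every point (x_1, ..., x_k). The two values differ by the factor `det_mod(C, p) * det_mod(D, p)`.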

For the effect of changes in the generators of $\{s_1, \dots, s_k\}$ let us consider
the transformation

$$s_j' = s_1^{a_{j1}}s_2^{a_{j2}}\cdots s_k^{a_{jk}} \qquad (j=1, \dots, k).$$

The matrix $M_j'$ is obtained by considering the commutators of $s_j'$ with
$U_1, \dots, U_m$ and is obviously $a_{j1}M_1+\cdots+a_{jk}M_k$. The operator
$s_1^{x_1}s_2^{x_2}\cdots s_k^{x_k}$ expressed in terms of the $s'$'s is
$s_1'^{x_1'}s_2'^{x_2'}\cdots s_k'^{x_k'}$ where the
$x$'s are obtained from the $x'$'s by a linear transformation whose matrix is the
matrix $A$ of exponents $a_{ij}$ above. After this transformation we have the matrix
$x_1'M_1'+\cdots+x_k'M_k'$ in place of $M=x_1M_1+\cdots+x_kM_k$. If now
instead of carrying out the transformation on the generators we subject the $x$'s
to the transformation just described, the matrix $M$ becomes, when the terms
in $x_i'$ are collected,

$$(a_{11}M_1+\cdots+a_{1k}M_k)x_1' + (a_{21}M_1+\cdots+a_{2k}M_k)x_2' + \cdots + (a_{k1}M_1+\cdots+a_{kk}M_k)x_k',$$

which is $x_1'M_1'+x_2'M_2'+\cdots+x_k'M_k'$. Therefore,


(11) Two metabelian groups $G=\{H, U\}$ and $G'=\{H, U'\}$ in which both
the $U$'s and the $U'$'s are permutable and of order $p$, and such that generating
relations are defined by sets of matrices $M_1, \dots, M_k$ and $M_1', \dots, M_k'$
respectively, are simply isomorphic if and only if the invariant factors of
$x_1M_1+x_2M_2+\cdots+x_kM_k$ and $x_1'M_1'+x_2'M_2'+\cdots+x_k'M_k'$ are conjugate under some
operator of the linear homogeneous group on the variables $x_1, \dots, x_k$.

* The term “invariant factor” is used here in a sense which is merely an extension to $k$ matrices
of the definition given by Dickson, loc. cit., p. 104, for two matrices $M$ and $N$. It is clear that the
polynomials are homogeneous and are left unchanged when the matrix $M=x_1M_1+\cdots+x_kM_k$
is replaced by $CMD$.


We return again briefly to the ideas of the first paragraph of this section.
In the situation we have been considering we have assumed that all the operators
of $U$ were of type $K$ and this implies that each of the $M_i$'s is non-singular
and also that $f(x_1, \dots, x_k)$ contains no linear factor. If $f(x_1, \dots, x_k)$
contains a linear factor, then $U$ contains an operator of type $J$ where $j<k$. It is
obvious that under those circumstances there are several possibilities. If the
operator in question is of type I it would be possible to select the generators
$s_1', \dots, s_k'$ and $U_1', \dots, U_m'$ so that all but one of the matrices $M_i'$ would
be singular. If however the operator were of type $(K-1)$, generators of $G$
could not be selected to make more than one of the $M_i'$'s singular. Thus the
condition that $f(x_1, \dots, x_k)$ have a single linear factor in the modular field
seems to permit the possibility of many distinct groups. This seems to lead
to a large subject in the theory of forms on which not much is to be found in
the literature. Dickson has considered* the types of forms that can be written
as determinants with linear elements. The classification of these forms
involves the theory of modular invariants. The classification of the groups
involves considerations beyond the modular invariants since the latter do not
take account of the invariant factors of $M$.


6. CONCLUDING REMARKS 


In conclusion we wish to point out the relation of our investigations to the
practically impossible problem of the classification of groups of order $p^n$. One
important sub-class of these groups, which from some points of view may be
considered as the most elementary, is made up of those groups whose operators
are all of order $p$, in other words, those groups which are conformal with
the abelian group of type 1, 1, .... It is with groups of this class that we
have been concerned. Every such group contains a maximal invariant abelian
subgroup of type 1, 1, .... Of these groups the most elementary are the
metabelian groups. Our plan of classification is to determine all of those such
that $U$ contains at least one operator of type $K$ and no operator of type
greater than $K$. In the cases where $k$ is 1 or 2 this plan of classification never
allows a given group to appear in more than one class, but for $k>2$ it is necessary
to look out for repetitions. For example, if $k=3$ and $m=2$, $G$ contains a
maximal invariant abelian subgroup $\{U_1, U_2, s_4, \dots, s_n\}$ of order $p^{n-1}$
and the operators $s_1, s_2, s_3$ which now serve as the $U$'s are all of type I or


* These Transactions, vol. 22 (1921), p. 167. 




type II. However those repetitions and all others are avoided by insisting 
that m be at least as great as k. 

But we are not yet willing to consider all metabelian groups which are 
conformal with the abelian group of type 1, 1, - - - , for a restriction which 
has been important throughout is that the U’s be permutable. It is obvious 
that U’s can be chosen, in the groups which we have considered, which are not 
permutable, for example, U/ =s,U; is not permutable with U2. Consequently, 
the removal of that restriction will not only increase our difficulties in manag- 
ing the groups but will increase greatly the number of repetitions. It appears 
quite likely that the best method of procedure when the restriction in ques- 
tion is removed is to arrange the groups according to the differences in orders 
of the commutator subgroup of G and the commutator subgroup arising from 
transformation of H by U. In all the groups we have considered these orders 
are equal; the question as to whether or not the converse is true does not seem 
to have an obvious answer. 


UNIVERSITY OF ILLINOIS, 
URBANA, ILL.


SUFFICIENT CONDITIONS FOR THE PROBLEM OF 
BOLZA IN THE CALCULUS OF VARIATIONS* 


BY 
MAGNUS R. HESTENES†


1. Introduction. Let the end points of the arcs

(1:1) $y_i=y_i(x)$ $(x^1\le x\le x^2;\ i=1, \dots, n)$

be denoted by the symbols $(x^1, y_1^1, \dots, y_n^1)$ and $(x^2, y_1^2, \dots, y_n^2)$. The
problem to be considered is that of finding in a class of arcs (1:1) and sets
$(a)=(a_1, \dots, a_r)$ satisfying the differential equations and end conditions

(1:2) $\phi_\beta(x, y, y')=0$ $(\beta=1, \dots, m<n)$,
(1:3) $x^s=x^s(a)$, $y_i^s=y_i^s(a)$ $(s=1, 2)$

one which minimizes a functional of the form

$J=\theta(a)+\int_{x^1}^{x^2} f(x, y, y')\,dx$.


This problem was first formulated by Bolza (II, p. 431)‡ and will be called
the problem of Bolza. The formulation here given is due to Morse and Myers
(VI, p. 236). Of special importance is the case in which the end conditions
(1:3) are of the form

(1:4) $x^1=x^1(a_1, \dots, a_p)$, $y_i^1=y_i^1(a_1, \dots, a_p)$,
      $x^2=x^2(a_{p+1}, \dots, a_r)$, $y_i^2=y_i^2(a_{p+1}, \dots, a_r)$

and the function $\theta(a)$ is of the form

$\theta(a)=\theta^1(a_1, \dots, a_p)-\theta^2(a_{p+1}, \dots, a_r)$.

The latter problem will be called the problem of Bolza with separated end
conditions. In the proof of Theorem 9:2 below it will be shown that the two
problems are equivalent not only in the sense that each can be transformed
into one of the other type but also in the sense that the theory of the one can
be deduced from that of the other.

* Presented to the Society, June 23, 1933, and March 31, 1934; received by the editors January
25, 1934.

† National Research Fellow. A considerable portion of the results here given were obtained while
the author was a Research Assistant to Professor Bliss at the University of Chicago.

‡ Roman numerals in parentheses refer to the list of references at the end of this paper.




Sufficient conditions for a minimum in the problems of Bolza were first
given by Morse (VIII) and later by Bliss (IX) and Hu (XIX). However, the
normality assumptions, which they make, prevent these conditions from being
applicable without further modification to the problem of Mayer (III;
XI), to the case in which the functions $\phi_\beta$ contain no derivatives, and to a
number of other problems. Sufficient conditions for the problem of Mayer
have been deduced by Bliss and Hestenes (XVII; XVIII) who make similar
restrictive normality assumptions. In §9 below we give for the first time sets
of sufficient conditions for the problem of Bolza containing no normality
assumptions whatsoever. We merely assume the existence of a set of multipliers
of the form $\lambda_0=1$, $\lambda_\beta(x)$ with which the arc g under consideration satisfies
suitable analogues of the usual sufficiency conditions. It is clear that the
results of the present paper are applicable at once to the problem of Mayer and
thus unify the problems of Bolza and Mayer so that they are equivalent not
only in the sense that each can be transformed into one of the other type
but also in the sense that the theory for the one can be deduced from that
of the other without further modification. The results of this paper also show
that the classical problem of Mayer can be considered as a problem of
Lagrange with one variable end point (cf. I, p. 224). Moreover by the use of a
device given by Bliss (V, p. 703) the results here given can be applied to the
case in which the functions $\phi_\beta$ contain no derivatives. One obtains thereby
an extension of the results given by Bower (XX).

In order to obtain the sufficient conditions here given we derive in §4 a 
new analogue of the necessary condition of Mayer for the problem of Bolza 
with separated end conditions. A similar condition has been given by Currier 
(XII, p. 699) for parametric problems without differential side conditions and 
with special end conditions. The methods of Currier, however, do not seem to 
be readily extensible to the problem of Bolza without making stringent nor- 
mality assumptions. A very special case of this necessary condition has been 
given by Bliss for variable end point problems in the plane (IV, pp. 324-6). 

The sufficiency proof given in §§ 6 and 9 below is new and is simpler than 
those given hitherto for the problems of Bolza. It is a direct extension of the 
classical method used for fixed end point problems and does not make use of 
the famous theorem of Hahn (IX, p. 267). 

The author has made extensive use of the papers of Bliss and Morse listed 
at the end of this paper. 

2. First necessary conditions. Let us suppose that we have given an open
region $\mathfrak{R}$ of points $(x, y, y')$ in which the functions $f$, $\phi_\beta$ have continuous
derivatives of the first three orders. A set $(x, y, y')$ is said to be admissible if
it is in $\mathfrak{R}$ and satisfies the equations $\phi_\beta=0$. A differentiably admissible arc is
a continuous arc having a continuously turning tangent except possibly at a
finite number of points on it and having all of its elements $(x, y, y')$ admissible.
A differentiably admissible arc (1:1) and a set of constants
$(a)=(a_1, \dots, a_r)$ satisfying the end conditions (1:3) are said to form an
admissible arc.

We center our attention on a particular admissible arc g and propose to
find under what conditions g will surely furnish a minimum to J relative to
neighboring admissible arcs. We assume that the matrix $\|\phi_{\beta y_i'}\|$ has rank m
on g and that the set (a) belonging to g is the set (a)=(0). The functions
$\theta(a)$, $x^s(a)$, $y_i^s(a)$ (s=1, 2) are assumed to have continuous first and second
partial derivatives near (a)=(0).

The tensor analysis summation convention will be used throughout. 

The following necessary condition is well known and has been established 
by Morse and Myers (VI, p. 245) and by Bliss and Schoenberg (X, pp. 681-3) 
and by others. 


THEOREM 2:1. If g affords a minimum to J then there exist for it constants
$c_i$, and a function $F=\lambda_0f+\lambda_\beta(x)\phi_\beta$ $(\beta=1, \dots, m)$ such that the equations

(2:1) $F_{y_i'}=\int_{x^1}^{x}F_{y_i}\,dx+c_i$ $(i=1, \dots, n)$

hold at every point of g. Moreover on g the equation

(2:2) $[(F-y_i'F_{y_i'})\,dx+F_{y_i'}\,dy_i]_1^2+\lambda_0\,d\theta=0$*

is an identity in $da_h$ when the differentials $dx^1$, $dy_i^1$, $dx^2$, $dy_i^2$, $d\theta$ are expressed
in terms of the differentials $da_h$. The multiplier $\lambda_0$ is a constant. The multipliers
$\lambda_\beta(x)$ are continuous except possibly at values of x defining corners of g.
The elements of the set $\lambda_0$, $\lambda_\beta(x)$ do not vanish simultaneously at any point on g.


By the order q of anormality of g on an interval $x'x''$ relative to the conditions
(2:1) is meant the number q of linearly independent sets of multipliers
of the form $\lambda_0=0$, $\lambda_\beta(x)$ with which g satisfies the conditions (2:1) on $x'x''$.
The order q of g on $x'x''$ cannot exceed the number m of differential equations
$\phi_\beta=0$. This follows because for every m+1 sets of multipliers of the form
$\lambda_0=0$, $\lambda_\beta(x)$ there exists at least one linear combination of these sets having
constant coefficients not all zero and vanishing at $x'$ and hence vanishing for
all values of x on $x'x''$. The case q=0 on every sub-interval of $x^1x^2$ has been


* The symbol $[\ ]_1^2$ denotes the value of $[\ ]$ at the final end point 2 on g minus its value at
the initial end point 1 on g.




treated by Morse and Bliss. In this case g is said to be normal on every sub- 
interval. 

Carathéodory (XV, XVI) has shown that in the analytic case the order q
of anormality of g is the same on every sub-interval of $x^1x^2$. In the
non-analytic case this is not necessarily true, as will be seen in the example given
at the end of §9.

By the order p of anormality of g relative to the conditions (2:1) and (2:2)
is meant the number p of linearly independent sets of multipliers of the form
$\lambda_0=0$, $\lambda_\beta(x)$ with which g satisfies the conditions (2:1) and (2:2). Clearly
the order p of g cannot exceed the order q of g on the interval $x^1x^2$ defined by
its end points. If p=0 then g is said to be normal. In the normal case there
exists an infinity of admissible arcs in every neighborhood of g. In the anormal
case this is not necessarily true. Moreover for a normal minimizing arc g there
exists a unique set of multipliers of the form $\lambda_0=1$, $\lambda_\beta(x)$ satisfying the
conditions of Theorem 2:1 (V, pp. 693-5).

We have the following analogue of the necessary condition of Weierstrass 
which has been established by Graves (XIII, p. 751). 


THEOREM 2:2. If g is a normal minimizing arc then at each element $(x, y,
y', \lambda)$ on g the inequality

$E(x, y, y', \lambda, Y')\ge 0$

must hold for every admissible set $(x, y, Y')\ne(x, y, y')$ whose matrix
$\|\phi_{\beta y_i'}(x, y, Y')\|$ has rank m, where

$E(x, y, y', \lambda, Y')=F(x, y, Y', \lambda)-F(x, y, y', \lambda)-(Y_i'-y_i')F_{y_i'}(x, y, y', \lambda)$.
The analogue of the necessary condition of Clebsch given in Theorem 4:5 
below can also be obtained from Theorem 2:2 by the arguments given by 
Bliss (V, pp. 718-9). 
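
As an illustration of the E-function in Theorem 2:2, the following sketch evaluates it in the simplest setting: one dependent variable, no differential side conditions, and the multiplier of the integrand taken as 1, so that F reduces to the integrand. The integrand y'^2 + y^2 and all function names are assumptions chosen for the example and do not come from the paper.

```python
def weierstrass_e(F, F_yp, x, y, yp, Yp):
    """E = F(x, y, Y') - F(x, y, y') - (Y' - y') * F_{y'}(x, y, y')."""
    return F(x, y, Yp) - F(x, y, yp) - (Yp - yp) * F_yp(x, y, yp)


def F(x, y, yp):
    """Illustrative integrand y'^2 + y^2 (an assumption, not from the paper)."""
    return yp * yp + y * y


def F_yp(x, y, yp):
    """Partial derivative of the illustrative integrand with respect to y'."""
    return 2 * yp
```

For this integrand the E-function reduces to (Y' - y')^2, so the inequality of the theorem holds at every element. For example, `weierstrass_e(F, F_yp, 0, 1, 2, 5)` evaluates to 9.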
An extremal arc is defined to be a differentiably admissible arc and a set
of multipliers

(2:3) $y_i=y_i(x)$, $\lambda_\beta=\lambda_\beta(x)$ $(x^1\le x\le x^2)$

having continuous derivatives $y_i'$, $y_i''$, $\lambda_\beta'$ and satisfying with $\lambda_0=1$ the
Euler-Lagrange equations

(2:4) $(d/dx)F_{y_i'}-F_{y_i}=0$, $\phi_\beta=0$.

An extremal is said to be non-singular if the determinant

$$\begin{vmatrix} F_{y_i'y_k'} & \phi_{\beta y_i'} \\ \phi_{\gamma y_k'} & 0 \end{vmatrix}$$

is different from zero at each element $(x, y, y', \lambda)$ on it. A study of the
extremal family has been made by Bliss (V, p. 687).

In the sequel it will be understood that the admissible arc g under consideration
is an extremal arc satisfying the conditions (2:1) and (2:2) of Theorem 2:1
unless otherwise expressly stated.

3. The second variation and the accessory minimum problem. In this section
we are concerned with the functional

$J_2(\eta, w)=b_{hl}w_hw_l+\int_{x^1}^{x^2}2\omega(x, \eta, \eta')\,dx$ $(h, l=1, \dots, r)$

evaluated along the extremal g, where (s not summed; s = 1, 2)

$b_{hl}=\theta_{hl}+\big[(F_x-y_i'F_{y_i})x_h^sx_l^s+(F-y_i'F_{y_i'})x_{hl}^s+F_{y_i}(x_h^sy_{il}^s+x_l^sy_{ih}^s)+F_{y_i'}y_{ihl}^s\big]_{s=1}^{s=2}$,
$2\omega=F_{y_iy_k}\eta_i\eta_k+2F_{y_iy_k'}\eta_i\eta_k'+F_{y_i'y_k'}\eta_i'\eta_k'$ $(i, k=1, \dots, n)$.

Here the symbols $x^s$, $y_i^s$ denote the functions $x^s(a)$, $y_i^s(a)$ and the subscripts
h and l denote differentiation with respect to $a_h$ and $a_l$ respectively at
(a)=(0). The matrix $\|b_{hl}\|$ is symmetric. The functions $\eta_i(x)$ are assumed to
possess continuous derivatives except possibly at a finite number of values of
x on the interval $x^1x^2$ and to satisfy with the constants $w_h$ the equations

$\Phi_\beta(x, \eta, \eta')=\phi_{\beta y_i}\eta_i+\phi_{\beta y_i'}\eta_i'=0$,
$\eta_i(x^s)=c_{ih}^sw_h$ $(s=1, 2;\ h=1, \dots, r)$

evaluated along g, where $c_{ih}^s=y_{ih}^s(0)-y_i'(x^s)x_h^s(0)$ (s not summed). Such a
set $\eta_i$, $w_h$ is called a set of admissible variations for g. The functional $J_2(\eta, w)$
is called the second variation of the functional J along g (cf. VIII, pp. 520-1).


THEOREM 3:1. If g is a normal minimizing extremal arc, then along g the
second variation $J_2$ of J must satisfy the condition $J_2(\eta, w)\ge 0$ for every set of
admissible variations $\eta_i$, $w_h$ having continuous second derivatives except possibly
at a finite number of values of x on the interval $x^1x^2$ defined by the end points of g.


The theorem follows readily from the derivation of the second variation
given by Morse (VIII, pp. 520-1) provided that we show that for every set
of admissible variations $\eta_i$, $w_h$ having the continuity properties described in
the theorem there exists a one-parameter family of admissible arcs

$y_i=y_i(x, e)$, $a_h=a_h(e)$ $(x^1(a)\le x\le x^2(a))$

containing g for e=0, having $\eta_i$, $w_h$ as its variations along g, and having the
following continuity properties. The functions $y_i(x, e)$, $a_h(e)$ have continuous
first and second derivatives with respect to e near e=0. The derivatives
$y_{ixe}$, $y_{ixee}$, $y_{ixx}$ exist and are continuous for values (x, e) near those belonging
to g except possibly at a finite number of values of x on $x^1x^2$. The existence
of such a family is readily established by the methods of Bliss (V, p. 695: cf.
VI, p. 249) with suitable modifications in order to obtain the necessary
derivatives.

Theorem 3:1 leads us to the study of the accessory minimum problem,
namely, the problem of minimizing the functional $J_2(\eta, w)$ in the class of
admissible variations $\eta_i$, $w_h$. This problem is a problem of Bolza of the type
described in §1. From Theorem 2:1 we obtain the following equations which a
minimizing arc without corners must satisfy:

(3:1) $(d/dx)\Omega_{\eta_i'}-\Omega_{\eta_i}=0$, $\Phi_\beta=0$ $(\beta=1, \dots, m)$,
(3:2) $\eta_i(x^s)-c_{ih}^sw_h=0$ $(s=1, 2)$,
(3:3) $\mu_0b_{hl}w_l+\zeta_i(x^2)c_{ih}^2-\zeta_i(x^1)c_{ih}^1=0$ $(h, l=1, \dots, r)$,

where $\Omega=\mu_0\omega+\mu_\beta\Phi_\beta$, $\zeta_i=\Omega_{\eta_i'}$. The equations (3:1) are known as the
accessory equations, the equations (3:2) as the secondary end conditions, the
equations (3:3) as the secondary transversality conditions. The extremals for
this problem will be called secondary extremals. The secondary end conditions
are said to be regular in case the $2n\times r$-dimensional matrix $\|c_{ih}^s\|$ has rank r
on g.

If g is non-singular the equations

$\zeta_i=\Omega_{\eta_i'}(x, \eta, \eta', \mu)$, $\Phi_\beta(x, \eta, \eta')=0$

with $\mu_0=1$ can be solved for the variables $\eta_i'$, $\mu_\beta$. The accessory equations
with $\mu_0=1$ are then found to be equivalent to equations of the form

(3:4) $d\eta_i/dx=G_i(x, \eta, \zeta)$, $d\zeta_i/dx=H_i(x, \eta, \zeta)$,

where $G_i$, $H_i$ are linear in the variables $\eta_i$, $\zeta_i$ (V, p. 727). For every pair of
solutions $\eta_i$, $\zeta_i$ and $u_i$, $v_i$ of these equations the expression $\eta_iv_i-\zeta_iu_i$ is a
constant (V, p. 738). If this constant is zero the solutions are said to be conjugate
solutions. A set of n mutually conjugate linearly independent solutions is said
to form a conjugate system.
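
The constancy of this bilinear expression can be seen in a one-variable illustration. The sketch assumes the accessory system dη/dx = ζ, dζ/dx = -η, which is what one obtains for the illustrative choice 2ω = η'^2 - η^2 with no side conditions. The system, the two solutions and the function name are assumptions made for the example, not taken from the paper.

```python
import math


def bilinear_concomitant(eta, zeta, u, v, x):
    """Value of eta*v - zeta*u at x for two solutions (eta, zeta) and (u, v)."""
    return eta(x) * v(x) - zeta(x) * u(x)


# Two solutions of the illustrative system d(eta)/dx = zeta, d(zeta)/dx = -eta.
first = (math.sin, math.cos)
second = (math.cos, lambda x: -math.sin(x))
```

Evaluating `bilinear_concomitant(*first, *second, x)` at any x gives -1 up to rounding, so these two solutions are not conjugate. Pairing `first` with itself gives 0, as it must for any single solution when n=1.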

In the separated end point case the quadratic form b,,;w,w, is of the form 


1 2 
— (u,v = 


where 
1 


Bur = Ou» — (Fe — yi — (F — yl 


(3:5) 11 
Fy Vir + Xv Vin) Vin» 


1934] THE PROBLEM OF BOLZA 799 


evaluated at the initial point 1 on g and 4,2 is a similar expression in 6,7, 
XP, Vid, Xe7, Vier evaluated at the final end point 2 on g. The matrices 
||b,.'|| and ||5,,?|| are symmetric. Moreover the equations (3:2) and (3:3) 
with uo =1 can be written in the form 


1 1 11 1 
(3:6) Ni = CipWy, = OyyW, (nu, 1, p), 


(3:7) n= Cie We, = Der, (o,r =p t+1,---,7). 


If the matrix ||c;,'|| has rank p then there are m and at most linearly inde- 
pendent solutions 


(3:8) nik(%), Wyk (k 5, n) 


of equations (3:4) and (3:6), as one readily verifies. Moreover the secondary 
extremals 7x, ¢:, in (3:8) form a conjugate system since at x =x! we have 


1 1 
— Semin = — 


1 1 
=> Wy; = 0 


and since these secondary extremals are linearly independent, as follows
readily from the fact that the matrix ||c_{iμ}^1|| has rank p. Similarly if the matrix
||c_{iρ}^2|| has rank r − p, then there are n and at most n linearly independent
solutions

(3:9)    u_{ik}(x), v_{ik}(x), w_{ρk}    (k = 1, ..., n)

of equations (3:4) and (3:7). It is clear that the secondary extremals u_{ik}, v_{ik}
also form a conjugate system.
The following lemma will be useful: 


Lemma 3:1. The order p of anormality of g is equal to the number of linearly
independent secondary extremals η_i, μ_β having (η) = (0) on x^1x^2 and satisfying
the equations (3:3) with the set (w) = (0).


This result follows because the first n equations (2:4) with λ_0 = 0 are
equivalent to the first n equations (3:1) with (η) = (0). Moreover the
transversality condition (2:2) with λ_0 = 0 is equivalent to the conditions (3:3) with
(η) = (0) and (w) = (0).

Similarly we have 


Lemma 3:2. The order q of anormality of g on the interval x^1x^3 is equal to the
number of linearly independent secondary extremals η_i, μ_β having (η) = (0) on
x^1x^3.


A further lemma is the following: 



Lemma 3:3. If u_i, v_i is a secondary extremal having (u) = (0) on x^1x^2 then
the relation v_iη_i = constant holds for every differentiably admissible arc η_i(x) for
the accessory minimum problem.


Let μ_β = λ_β(x) be the multipliers belonging to the secondary extremal u_i, v_i.
The lemma now follows readily by multiplying the equations of variation

    φ_{βy_i}η_i + φ_{βy_i'}η_i' = 0

by the functions λ_β(x), adding, and applying the usual integration by parts
with the help of equations (2:4) with λ_0 = 0.

An important consequence of Lemma 3:3 is that the accessory minimum 
problem can be modified so that its admissible arcs are all normal. This can 
be done by replacing the secondary end conditions (3:2) by the conditions 
(3:10) = Cian, = + (y =1,---, 
where p is the order of anormality of g and η_{iγ}, ζ_{iγ} are p linearly independent
secondary extremals having the properties described in Lemma 3:1. We may
suppose that these secondary extremals have been chosen so that the columns
of the matrix ||ζ_{iγ}^s|| are normed and orthogonalized. By Lemma 3:1 we have


- 22 1 1 
(3:11) — = O (y= 1,---,). 


Multiplying the equations (3:10) by the values −ζ_{iγ}^1, ζ_{iγ}^2 and adding, it is
found with the help of equations (3:11) and Lemma 3:3 that the equations

    0 = −ζ_{iγ}^1 η_i^1 + ζ_{iγ}^2 η_i^2 = w_{r+γ}    (γ = 1, ..., p)

hold for every admissible arc η_i, w_μ, w_{r+γ} for the new problem. The new
problem is therefore equivalent to the original one. Moreover every admissible
arc for the new problem is normal, by Lemma 3:1, since the secondary
extremals η_{iγ}, ζ_{iγ} described above do not satisfy the analogue of conditions
(3:3) with the set w_μ = w_{r+γ} = 0.


Lemma 3:4. If g is non-singular then a minimizing arc g_2 for the accessory
minimum problem must be an arc defined by a secondary extremal and hence can
have no corners.


As was seen above we may assume that g_2 is normal. Since g_2 is a minimizing
arc there exists for it a unique function Ω = ω + μ_βΦ_β with which g_2 satisfies
the conditions implied by Theorem 2:1. The functions ζ_i = Ω_{η_i'} are therefore
continuous along g_2. The non-singularity of g and hence of g_2 implies
that the functions η_i, μ_β belonging to g_2 define extremal segments between
corners of g_2 (V, p. 684). The continuity of the functions ζ_i now implies that
the arc g_2 can have no corners since there is one and only one secondary
extremal taking given values η_i^0, ζ_i^0 at a value x = x^0 on x^1x^2. This proves the
lemma.

4. Necessary conditions for the second variation to be positive. The second
variation J_2(η, w) is said to be positive along g if the inequality J_2(η, w) ≥ 0
holds for every set of admissible variations η_i, w_μ belonging to g. The results
of this section will remain valid if we further restrict these variations to have
the continuity properties described in Theorem 3:1. The necessary conditions
here given must therefore be satisfied if g is to be a normal minimizing arc
for the original problem.

We have the following necessary condition in the separated end point 
case. The relations between this condition and those of Currier and Bliss have 
been explained in §1. 


THEOREM 4:1. If in the separated end point case the extremal g is
non-singular, the secondary end conditions are regular, and the second variation J_2
is positive along g, then at each point x^3 on x^1x^2 the inequality

(4:1)    (ζ_{ij}u_{ik} − η_{ij}v_{ik}) a_j b_k ≥ 0    (i, j, k = 1, ..., n)

must hold for every set of constants (a_j, b_k) satisfying the equations

(4:2)    η_{ij}(x^3)a_j = u_{ik}(x^3)b_k,

where η_{ij}, ζ_{ij} and u_{ik}, v_{ik} are the conjugate systems belonging to the sets (3:8) and
(3:9) respectively. The coefficients in the bilinear form (4:1) are constants.


In order to prove the theorem we note that a set of constants (a_j, b_k)
satisfying the equations (4:2) determines a broken secondary extremal η_i, ζ_i
defined by the equations

(4:3)    η_i = η_{ij}a_j,    ζ_i = ζ_{ij}a_j    on x^1 ≤ x ≤ x^3,
         η_i = u_{ik}b_k,    ζ_i = v_{ik}b_k    on x^3 ≤ x ≤ x^2

and satisfying the conditions (3:6) and (3:7) with the set of constants
w_μ = w_{μj}a_j, w_ρ = w_{ρk}b_k. Let μ_β be the set of multipliers belonging to the
broken extremal η_i, ζ_i. With the help of the formula

(4:4)    2Ω = η_iΩ_{η_i} + η_i'Ω_{η_i'} + μ_βΩ_{μ_β}

and the usual integration by parts it is found that along this broken extremal
the second variation J_2 is expressible in the form

1 2 z 
Je by» bor Wo Ws f 2Qdx f 20dx 
xl x 


1 2 2 2-0 
ber We Ws + + [nits] 


+ 

By the use of equations (3:6), (3:7), (4:2), and (4:3) it follows readily that

    J_2 = η_i(x^3)[−ζ_i(x^3 + 0) + ζ_i(x^3 − 0)]

        = (ζ_{ij}u_{ik} − η_{ij}v_{ik}) a_j b_k.

This formula justifies the inequality (4:1). The last statement in the theorem
follows from the remarks made in the paragraph containing the equations
(3:4). The theorem is now proved.

Consider now the problem of Bolza in which the end conditions are not
necessarily separated. Suppose for the moment that g is non-singular. Let
η_{iρ}, μ_{βρ}, w_{μρ} (ρ = 1, ..., ν) be a maximum set of linearly independent
secondary extremals and constants (w) satisfying the secondary end conditions
(3:2). It is clear that the quadratic form

(4:5)    Q(z) = J_2(η_{iρ}z_ρ, w_{μρ}z_ρ)

in the constants (z_1, ..., z_ν) must be positive on g if the second variation J_2
is to be positive along g. This proves the first part of the following theorem:


THEOREM 4:2. If the extremal g is non-singular and the second variation
J_2 is positive along g, then the quadratic form (4:5) must be positive on g.
Moreover at each point x^3 on x^1x^2 the inequality (4:1) must hold for every set of
constants (a_j, b_k) satisfying the equations (4:2), where η_{ij}, ζ_{ij} and u_{ik}, v_{ik} are
conjugate systems of secondary extremals having η_{ij}(x^1) = u_{ik}(x^2) = 0.


The last part of the theorem is obtained by applying Theorem 4:1 to the
case in which the secondary end conditions are of the form η_i^1 = 0, η_i^2 = 0.

A value x^3 ≠ x^1 is said to define a point 3 conjugate to 1 on g if there exists
a secondary extremal u_i = u_i(x), μ_β = μ_β(x) having u_i(x^1) = u_i(x^3) = 0 but not
(u) = (0) on x^1x^3.

The following necessary condition is a direct extension of a condition 
given by Bliss (IX, p. 266). 


THEOREM 4:3. If the extremal g is non-singular and the second variation J_2
is positive along g, then the quadratic form (4:5) must be positive on g. Moreover
there can be no point 3 conjugate to 1 on g between its end points 1 and 2 defined
by a secondary extremal u_i(x), μ_β(x) with (u') ≠ (0) at x = x^3. If the order q of
anormality of g is the same on every sub-interval x^3x^2 of x^1x^2, then there can be no
point 3 conjugate to 1 on g between 1 and 2.


For if there were a point 3 conjugate to 1 on g between 1 and 2 defined
by a secondary extremal u_i, μ_β, then along the arc

    η_i = u_i(x)  (x^1 ≤ x ≤ x^3),    η_i = 0  (x^3 ≤ x ≤ x^2),    w_μ = 0

the second variation would take the value zero (V, p. 726). This arc would
therefore be a minimizing arc for the accessory minimum problem and hence
could have no corners, by Lemma 3:4. This proves the first statement
concerning conjugate points.

In order to prove the last statement of the theorem we note that according
to Lemma 3:4 the functions η_i just defined would belong to a secondary
extremal η_i, μ_β. The functions η_i would then be identically zero on x^1x^2 since,
as one easily sees, Lemma 3:2 and our assumption concerning anormality
imply that a secondary extremal η_i, μ_β having η_i = 0 on x^3x^2 has η_i = 0 on the
whole interval x^1x^2. It follows that in this case there can be no point 3 on g
conjugate to 1 between 1 and 2. This completes the proof of the theorem.

By the accessory boundary value problem is meant the equations

(4:6)    (d/dx)Ω_{η_i'} − Ω_{η_i} + ση_i = 0,    Φ_β = 0,
         η_i^s − c_{iμ}^s w_μ = ζ_i^2 c_{iμ}^2 − ζ_i^1 c_{iμ}^1 + b_{μν}w_ν = 0    (s = 1, 2),

where Ω = ω + μ_βΦ_β. A set of functions η_i(x), μ_β(x) having continuous
derivatives η_i', η_i'', μ_β' and having (η) ≠ (0) on x^1x^2 is said to form a characteristic
solution if it satisfies the equations (4:6) with a set of constants w_μ, σ. The
corresponding value σ is called a characteristic root.

We now have the further necessary condition: 


THEOREM 4:4. If the second variation J_2(η, w) is positive along the extremal
g then there can be no negative characteristic roots of the accessory boundary value
problem.


The proof of this theorem is well known (VIII, p. 524). 
We also have the further necessary condition which is an analogue of the 
necessary condition of Clebsch: 


THEOREM 4:5. If the second variation J_2(η, w) is positive along the extremal
g, then at each element (x, y, y', λ) on g the inequality

(4:7)    F_{y_i'y_k'} π_i π_k ≥ 0

must hold for every set (π) ≠ (0) which is a solution of the equations φ_{βy_i'}π_i = 0.
If g is non-singular then the condition (4:7) holds with the equality sign
excluded.


According to the remarks preceding Lemma 3:4 we may suppose that g is
normal. The first statement of the theorem can now be obtained by applying 
Theorem 2:2 to the accessory minimum problem and by the use of Taylor’s 
expansion. The last statement follows readily from well known theorems on 
quadratic forms. 
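
A condition of the Clebsch type asks that a quadratic form be positive only on the solutions of a set of linear equations, not on all vectors. The sketch below illustrates this for two variables and one equation. The matrices are made-up numbers, the function names are hypothetical, and the example assumes the constraint row is not (0, 0).

```python
def null_direction(row):
    """Return a vector spanning the solutions of a*p1 + b*p2 = 0 for row = (a, b) != (0, 0)."""
    a, b = row
    return (-b, a)

def quadratic_form(matrix, vec):
    """Evaluate sum over i, k of matrix[i][k] * vec[i] * vec[k] for a 2 x 2 matrix."""
    return sum(matrix[i][k] * vec[i] * vec[k] for i in range(2) for k in range(2))

def restricted_value(matrix, constraint_row):
    """Value of the quadratic form on the spanning solution of the single linear equation."""
    return quadratic_form(matrix, null_direction(constraint_row))

# An indefinite matrix standing in for the second derivatives of F (illustrative numbers).
second_derivatives = [[1.0, 0.0], [0.0, -1.0]]

value_when_p2_is_zero = restricted_value(second_derivatives, (0.0, 1.0))
value_when_p1_is_zero = restricted_value(second_derivatives, (1.0, 0.0))
```

With the first constraint the form is positive on every nonzero solution even though the matrix is indefinite. With the second constraint it is negative, so the condition fails.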



5. Criteria for conjugate points. A first criterion for conjugate points is
the following one:


THEOREM 5:1. If the extremal g is non-singular and if the functions
u_{ij}, v_{ij} (j = 1, ..., 2n) form 2n linearly independent secondary extremals for g,
then a value x^3 ≠ x^1 defines a point 3 conjugate to 1 on g if and only if the matrix

(5:1)    || u_{ij}(x^1) ||
         || u_{ij}(x^3) ||

has rank less than 2n − q, where q is the order of anormality of g on the interval
x^1x^3.

The proof of this theorem can be made by the usual methods (V, p. 728)
with the help of Corollary 3:2.


THEOREM 5:2. If the extremal g is non-singular and the order q of anormality
of g is the same on every sub-interval x^1x^3 of x^1x^2, then there exists for g a
conjugate system η_{ik}, ζ_{ik} of secondary extremals such that the points 3 conjugate to 1
on g are determined by the zeros x^3 ≠ x^1 of the determinant |η_{ik}(x)|.


If q = 0 then it suffices to choose the secondary extremals η_{ik}, ζ_{ik} which
take the initial values η_{ik}(x^1) = 0, ζ_{ik}(x^1) = δ_{ik}, where δ_{ik} is the Kronecker
delta. This follows readily from Theorem 5:1 by choosing the first n secondary
extremals of the set u_{ij}, v_{ij} to be the set η_{ik}, ζ_{ik}.
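
For n = 1 and q = 0 the recipe above reduces to following the single solution with η(x^1) = 0, ζ(x^1) = 1 and looking for its next zero. The sketch below does this numerically. It assumes the model system dη/dx = ζ, dζ/dx = −η (chosen for illustration, not taken from the text), uses a classical fourth-order Runge-Kutta step, and all function names are hypothetical.

```python
import math

def rk4_step(rhs, x, state, h):
    """Advance state by one classical Runge-Kutta step of size h."""
    k1 = rhs(x, state)
    k2 = rhs(x + h / 2, [s + h / 2 * k for s, k in zip(state, k1)])
    k3 = rhs(x + h / 2, [s + h / 2 * k for s, k in zip(state, k2)])
    k4 = rhs(x + h, [s + h * k for s, k in zip(state, k3)])
    return [s + h / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def model_rhs(x, state):
    """Right-hand side of the assumed model system d(eta)/dx = zeta, d(zeta)/dx = -eta."""
    eta, zeta = state
    return [zeta, -eta]

def first_conjugate_point(x1, x_max, h=1e-3):
    """Return the first zero after x1 of eta with eta(x1) = 0, zeta(x1) = 1, or None.

    Only a change from positive to non-positive is detected, which suffices here
    because eta starts out increasing from zero.
    """
    x, state = x1, [0.0, 1.0]
    while x < x_max:
        new_state = rk4_step(model_rhs, x, state, h)
        if state[0] > 0.0 and new_state[0] <= 0.0:
            # Locate the zero inside the step by linear interpolation.
            fraction = state[0] / (state[0] - new_state[0])
            return x + fraction * h
        x, state = x + h, new_state
    return None

conjugate_point = first_conjugate_point(0.0, 4.0)
none_before_three = first_conjugate_point(0.0, 3.0)
```

In this model η = sin x, so the point conjugate to x^1 = 0 lies at π. An interval ending before π contains no conjugate point.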

If g>0 we choose the first » secondary extremals of the set u;;, v;; of 
Theorem 5:1 such that u;,(x') =0 (k=1, - - - , uiy(x) =0 (y=1,---, 9) 
on x!x%, and such that the columns of the matrix ||v;,(«")|| are normed and 
orthogonalized. The second secondary extremals of this set are chosen so 
as to take the initial values w;,n4%(x') =v;2(%"), vin4%(x') =0. The secondary 
extremals Cix=Vi,q+% Can now be shown to have the properties 
described in the theorem. An examination of their values at x =x! will show 
that they are mutually conjugate. Moreover it is clear that the matrix (5:1) 
has rank 2n—q if and only if the matrix (7-=1, - - - , #—@) 
has rank n—g at x=x*. The theorem will now follow from Theorem 5:1 if 
we show that the determinant |x| is different from zero if and only if the 
matrix ||7;,|| has rank »—q. If the determinant |,.| vanishes at a value 
x=x' then there exist constants a, not all zero such that the equations 
niz(x*)a,=0 hold. By the use of Lemma 3:3 and by a consideration of the 
initial values of the secondary extremals under consideration it is found that 


O = = = Gn-giy (Y = 1,---, 


The matrix ||η_{iτ}|| (τ = 1, ..., n − q) must therefore have rank less than n − q
whenever the determinant |η_{ik}| vanishes. The converse is immediate, and
the theorem is established.

6. A fundamental sufficiency theorem. The notion of a Mayer field 𝔉
used here is that given by Bliss (V, p. 730). The slope functions and the
multipliers belonging to 𝔉 will be denoted by the symbols p_i(x, y), λ_β(x, y). The
Hilbert integral

    I* = ∫ {F(x, y, p, λ)dx + (dy_i − p_i dx)F_{y_i'}(x, y, p, λ)}

formed for these functions and λ_0 = 1 is independent of the path in 𝔉. The
value of the integral I* along an extremal of the field is equal to that of the
integral

(6:1)    I = ∫ F(x, y, y', λ)dx.

The Weierstrass E-function E(x, y, p, λ, y') is the expression (2:3).
If g is an extremal of a Mayer field then the transversality condition (2:2)
for g implies that the equation

(6:2)    [dI*]_1^2 + dθ = 0

is an identity in dα_μ on g when the differentials dx^1, dy_i^1, dx^2, dy_i^2, dθ are
expressed in terms of the differentials dα_μ. It follows readily that on g the second
differential

(6:3)    [d^2 I*]_1^2 + d^2 θ

is a quadratic form in the variables dα_μ. With this in mind we can prove the
following theorem:


THEOREM 6:1. Let 𝔉 be a Mayer field in which the inequality

(6:4)    E[x, y, p(x, y), λ(x, y), y'] > 0

holds for every admissible set (x, y, y') ≠ (x, y, p). If g is an extremal of the field
such that the equation (6:2) is an identity in dα_μ on g and such that the quadratic
form (6:3) is positive definite on g, then g affords a proper minimum to J
relative to admissible arcs C in 𝔉 with sets (α) near (0).


Let A^1, A^2 be the arcs in 𝔉 defined by the equations


(A*) ye = (OS¢51;5=1,2) 


for a set (α) near (0), where the functions on the right are those appearing in
equations (1:3). The condition (6:2) and the positive definiteness of the
quadratic form (6:3) tell us that the set (α) = (0) furnishes a proper minimum
to the function

    W(α) = θ(α) − θ(0) + I*(A^2) − I*(A^1)

relative to sets (α) near (0).
Suppose now that the set (α) belongs to an admissible arc C in 𝔉. With
the help of the formula (2:3) and the invariant property of the integral I*
it is found that

    I(C) − I(g) = ∫_C E dx + I*(A^2) − I*(A^1),

where I is the integral (6:1). When the expression θ(α) − θ(0) is added to both
sides of the last equation, the formula

    J(C) − J(g) = ∫_C E dx + W(α)

is obtained. Hence we have J(C) ≥ J(g) provided that the set (α) belonging
to the arc C is near (0). The equality holds only in case (α) = (0) and the
integral of the E-function vanishes, that is, only in case the ends of C coincide
with those of g and the equations y_i' − p_i = 0 hold along C. The arc C would
then be an extremal of the field and would coincide with g since there is but
one extremal of the field through each point of 𝔉 (cf. V, pp. 731-2).
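
The sign condition (6:4) can be tried on the simplest integrand. The sketch below assumes the standard form of the Weierstrass E-function for one dependent variable and no side conditions, E = F(x, y, y') − F(x, y, p) − (y' − p)F_{y'}(x, y, p); the text cites this function as expression (2:3) without reproducing it here. The integrand F = y'^2 and all function names are illustrative assumptions.

```python
def weierstrass_e(F, F_slope, x, y, p, y_prime):
    """E = F(x, y, y') - F(x, y, p) - (y' - p) * dF/dy'(x, y, p), one dependent variable."""
    return F(x, y, y_prime) - F(x, y, p) - (y_prime - p) * F_slope(x, y, p)

def integrand(x, y, slope):
    """Illustrative integrand F = y'**2 (an assumption, not taken from the paper)."""
    return slope ** 2

def integrand_slope_derivative(x, y, slope):
    """Derivative of the illustrative integrand with respect to the slope."""
    return 2.0 * slope

# Field slope p = 0.5 compared with a different slope y' = 2.0.
e_different_slope = weierstrass_e(integrand, integrand_slope_derivative, 0.0, 0.0, 0.5, 2.0)
# The same slope on both sides gives E = 0.
e_same_slope = weierstrass_e(integrand, integrand_slope_derivative, 0.0, 0.0, 0.5, 0.5)
```

For this integrand E reduces to (y' − p)^2, which is positive whenever y' differs from p, so the analogue of (6:4) holds for every field slope.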

In the sequel we shall apply Theorem 6:1 only to the problem of Bolza 
with separated end points. If the end conditions are not of the form (1:4) 
then it is not always possible to construct a field such that the quadratic 
form (6:3) is positive definite on g. This can be seen by considering the special 
problem in (xyiy2)-space for which 6 =a, f=0, x'(a) =0, y? (a) =0, y2 (a) = —a, 
=1, y?(a)=0, y?(a)=a, and =0. The sufficient 
conditions given in §9 below, however, are applicable to this problem. 

7. Three lemmas. Consider first the problem of Bolza with separated end
conditions. Suppose that g is non-singular and that the secondary end
conditions are regular on g. The arc g will be said to satisfy the condition IV' if
at each point 3 on g the inequality (4:1) holds subject to the conditions (4:2)
and if furthermore the matrix

(7:1)    ||ζ_{ij}u_{ik} − η_{ij}v_{ik}||

of the coefficients in the bilinear form (4:1) has rank n − p on g, where p
is the order of anormality of g. The matrix (7:1) has rank n − p on g if and
only if the equations (3:4), (3:6), (3:7) have no solution (η, ζ, w) other than
those described in Lemma 3:1. In the fixed end point case the matrix (7:1)
has rank n − p if and only if the end points of g are not conjugate to each
other, as is readily verified.


Lemma 7:1. If in the separated end point case the extremal g is non-singular
and satisfies the condition IV' and if the secondary end conditions are regular
on g, then there exists for g a conjugate system U_{ik}, V_{ik} of secondary extremals
whose determinant |U_{ik}(x)| is different from zero on the interval x^1 ≤ x ≤ x^2
determined by the end points 1 and 2 of g. Moreover the inequalities


hold for every set of constants (a_k, z_μ) ≠ (0, 0) and (b_k, z_ρ) ≠ (0, 0) satisfying the
equations


(7:2) 


For, if p is the order of anormality of g, then by virtue of Lemma 3:1 we
can select the first p solutions of the sets (3:8) and (3:9) so that on x^1x^2 we
have

(7:4)    η_{iγ}(x) = u_{iγ}(x) = 0,    ζ_{iγ}(x) = v_{iγ}(x)    (γ = 1, ..., p)

and so that the columns of the matrix ||ζ_{iγ}(x^1)|| are normed and
orthogonalized. We then select the remaining solutions of these sets so that the relations
(7:5) = 0, = = 0, 

(7:6)    ζ_{iα}u_{iβ} − η_{iα}v_{iβ} = δ_{αβ}    (α, β = p+1, ..., n)

hold, where δ_{αβ} is the Kronecker delta. In order to obtain the relations (7:6)
we note that, since the conjugate systems η_{ik}, ζ_{ik} and u_{ik}, v_{ik} have the
secondary extremals (7:4) in common, it follows that

(7:7)    ζ_{iα}u_{iγ} − η_{iα}v_{iγ} = 0,    ζ_{iγ}u_{iα} − η_{iγ}v_{iα} = 0    (γ = 1, ..., p).

The determinant

(7:8)    |ζ_{iα}u_{iβ} − η_{iα}v_{iβ}|    (α, β = p+1, ..., n)

must therefore be different from zero if the matrix (7:1) is to have rank n − p.
The relations (7:6) are now obtained by replacing the solutions nia, fia, Wua by 
the solutions as, af; as, Where the matrix || A is the reciprocal 
of the matrix (7:8). 

The secondary extremals U_{ik}, V_{ik} taking the initial values




Uig(x!) = Uial x!) = nia(%!) + (Y= 


7:9 
= 0, = + (a = p+1,- 


can be shown to have the properties described in the theorem. In the first
place these secondary extremals are mutually conjugate, as is easily seen,
with the help of equations (7:5), (7:6), (7:9) and the conjugacy of the
systems η_{ik}, ζ_{ik} and u_{ik}, v_{ik}. Moreover the determinant |U_{ik}(x)| is different from
zero on x^1x^2. In order to prove this we use the relations


(7:10) = Oye, = 0 


which hold identically on x^1x^2 by virtue of Lemma 3:3 and the equations
(7:4), (7:7), (7:9) together with the fact that the columns of the matrix
||ζ_{iγ}(x^1)|| are normed and orthogonalized. If now the determinant |U_{ik}(x)|
were zero at a point x^3 on x^1x^2 then there would exist constants c_k not all
zero such that U_{ik}(x^3)c_k = 0. By multiplying these equations by ζ_{iγ}(x^3),
adding, and using equations (7:10) it would follow that the constants c_1, ..., c_p
would all be zero and hence that


Vial x*)Ca = + Uia(x*)Co =O (2 = pt+1,---,m). 


The equations (4:2) would then be satisfied by the set a_k = c_k, b_k = −c_k, and
for these constants the bilinear form (4:1) would take the value a_αb_α = −c_αc_α
< 0 by virtue of the equations (7:6) and (7:7). But this would contradict the
condition IV'. The determinant |U_{ik}| must therefore be different from zero
on x^1x^2.

We shall now establish the first of the inequalities (7:2). In order to do
this we first note that the constants a_1, ..., a_p in equations (7:3) are all
zero, as can be easily seen, by multiplying the first n of these equations by
ζ_{iγ}(x^1), adding, and applying the equations (7:10) and the analogue of
equations (3:11) for the separated end point case. We use the abbreviations
(7:11) Ni = Niada, $i = Siaday Wy = Wyaday 

Ui = Ujala, Vi = Viala, = Wy t Wye (a= p+i1,---,n) 
and find that the set ;, ¢:, w, satisfies the equations (3:6). Moreover by the 
use of equations (3:6) and (7:3) it follows readily that at x =x! 


1 1 
T , 
Unde — = Ui — = 0 


1 1 
by» = = 


1 1 


With the help of the last two formulas and the equations (7:11) it is found
that the first member of the relations (7:2) is expressible in the form

+ wi + wl) — (me + + 00) 


= — um] + — nevi). 


The second bracket is equal to the sum a_αa_α by virtue of equations (7:6) and
(7:11) and is positive unless the constants a_α are all zero, in which case the
constants z_μ in equations (7:3) are also all zero since the secondary end
conditions are regular. The first bracket in the last equation is positive or zero.
For, as a consequence of the regularity of the secondary end conditions there
exists for every set of constants w_μ' a secondary extremal η_{i0}, ζ_{i0} satisfying
the conditions (3:6) with w_μ = w_μ'. The set η_{i0}, ζ_{i0}, w_μ' is expressible linearly
with constants c_k in terms of the set (3:8). Moreover it is clear that
η_{i0}(x^1) = u_i(x^1). Hence we have at x = x^1


1 
byvWy Wy — UV; = — 


= — niods = — 


and this expression must be positive or zero by IV'. This proves the first
inequality (7:2). The second can be established by the same method. The proof
of Lemma 7:1 is now complete.

We also have the further useful lemma: 


Lemma 7:2. If the extremal g is non-singular and its end points are not
conjugate to each other, then the end points of every differentiably admissible arc g_2
for the accessory minimum problem can be joined by a secondary extremal.


To prove this let q be the order of anormality of g on x^1x^2 and suppose
that the first q secondary extremals u_{iγ}, v_{iγ} (γ = 1, ..., q) of the set u_{ij}, v_{ij}
(j = 1, ..., 2n) appearing in Theorem 5:1 have been chosen so that u_{iγ} = 0
on x^1x^2. Since the end points of g are not conjugate, the end values of the
remaining 2n − q secondary extremals of this set form a set of 2n − q linearly
independent solutions of the equations

    v_{iγ}(x^2)η_i^2 = v_{iγ}(x^1)η_i^1    (γ = 1, ..., q),

by Lemma 3:3, and every solution η_i^1, η_i^2 of these equations is expressible
linearly in terms of these 2n − q solutions. The end points of g_2 satisfy these
equations, by Lemma 3:3. This proves Lemma 7:2.

By the Clebsch condition III’ is meant the conditions of Theorem 4:5 with 
the equality sign excluded. The condition III’ for g implies that g is non- 
singular (V, p. 735). We can now prove the following lemma: 




Lemma 7:3. If the extremal g satisfies the condition III' and if there exists
for g a conjugate system U_{ik}, V_{ik} of secondary extremals whose determinant
|U_{ik}(x)| is different from zero on x^1x^2, then every secondary extremal u_i, v_i is
an extremal of a Mayer field defined over the region 𝔉_2 of points (x, η) whose
x-projections lie on the interval x^1x^2. Moreover the analogue of the condition (6:4)
holds in 𝔉_2.


For, the n-parameter family of secondary extremals

(7:12)    η_i = u_i(x) + U_{ik}(x)a_k,    ζ_i = v_i(x) + V_{ik}(x)a_k

contains the extremal u_i, v_i for values (a) = (0) and simply covers the region
𝔉_2. Moreover the Hilbert integral I* formed for the function 2Ω is
independent of the path on the hyperplane x = x^1. The family (7:12) therefore
defines a field over 𝔉_2 (V, p. 733; VII, p. 571). The last statement in the
lemma follows at once from the condition III' by the use of Taylor's
expansion.

8. Necessary and sufficient conditions for the second variation to be
positive definite. The second variation J_2 is said to be positive definite along the
extremal g if the inequality J_2(η, w) > 0 is true for every set of admissible
variations (η, w) ≠ (0, 0) belonging to g.


THEOREM 8:1. If in the separated end point case the extremal g is
non-singular and the secondary end conditions are regular on g, then the second
variation J_2(η, w) is positive definite along g if and only if the conditions III' and
IV' hold along g.


The necessity of the conditions III' and IV' follows at once from
Theorems 4:1, 4:5 and the remarks preceding Lemma 7:1. The sufficiency of these
conditions follows readily from Lemmas 7:1, 7:3 and Theorem 6:1 applied to
the secondary extremal u_i = v_i = 0, the conditions (7:2) and (7:3) implying
the positive definiteness of the analogue of the quadratic form (6:3).

We now turn to the case in which the end conditions are of the form (1:3).
By the condition V' is meant the necessary conditions of Theorem 4:2 with
the added assumption that the equation Q(z) = 0 holds only in case η_{iρ}z_ρ = 0,
w_{μρ}z_ρ = 0 on x^1x^2. The condition V' for g prevents its end points from being
conjugate to each other. For if the end points 1 and 2 of g were conjugate then
there would exist a secondary extremal η_i, μ_β with η_i(x^1) = η_i(x^2) = 0 and
(η) ≠ (0) on x^1x^2. The set η_i, μ_β, w_μ = 0 would then be expressible linearly with
constants z_ρ in terms of the set η_{iρ}, μ_{βρ}, w_{μρ} appearing in the definition of the
quadratic form (4:5). For these values of (z) we would have Q(z) = 0, as is
easily seen, with the help of the formula (4:4) and the usual integration by
parts. It follows that the end points of g cannot be conjugate if the condition
V' is to hold along g.


THEOREM 8:2. If the extremal g is non-singular then the second variation
J_2(η, w) is positive definite along g if and only if the conditions III' and V' hold
along g.


It is clear that the conditions III', V' are necessary. In order to show that
they are sufficient we note first that the condition V' for g implies the
condition IV' for the fixed end point case. Lemma 7:1 now tells us that there exists
a conjugate system U_{ik}, V_{ik} of secondary extremals whose determinant
|U_{ik}(x)| is different from zero on x^1x^2. From Lemma 7:3 and Theorem 6:1
we conclude that every secondary extremal u_i, v_i affords a proper minimum
to the integral

    I_2 = ∫ 2ω(x, η, η')dx

relative to differentiably admissible arcs η_i(x) joining its end points.
Suppose now that η_i, w_μ is an admissible arc for the accessory minimum
problem. By Lemma 7:2 there exists a secondary extremal u_i, v_i joining its
end points. We have accordingly

(8:1)    J_2(η, w) − J_2(u, w) = I_2(η) − I_2(u) ≥ 0,

the equality being valid only in case (η) = (u). From the definition of the
quadratic form Q(z) it is clear that there exist constants z_ρ such that
Q(z) = J_2(u, w). From the condition V' and the relation (8:1) we now
conclude that J_2(η, w) > 0 unless (η, w) = (0, 0), as was to be proved.

The extremal g will be said to satisfy the condition VI' if the quadratic
form (4:5) is positive on g and vanishes only in case η_{iρ}z_ρ = 0, w_{μρ}z_ρ = 0 on x^1x^2,
and if furthermore there is no point 3 conjugate to the initial point 1 on g.
We can now prove the following theorem:


THEOREM 8:3. If the extremal g is non-singular and the order q of anormality
of g is the same on every sub-interval x^1x^3 of x^1x^2, then the second variation
J_2(η, w) is positive definite along g if and only if the conditions III', VI' hold
along g.


The proof of this theorem is like that of Theorem 8:2 provided that we
can show that there exists for g a conjugate system U_{ik}, V_{ik} of secondary
extremals having its determinant |U_{ik}(x)| different from zero on x^1x^2. This
latter result will be obtained by a method first used by Morse (VII, pp.
574-6) for the problem of Lagrange and later adapted to the problem of
Mayer by Bliss and Hestenes (XVII, pp. 320-2). In the proof we suppose
that the conjugate system η_{ik}, ζ_{ik} of Theorem 5:2 has been chosen to take
the values at x=x?, where is the Kronecker delta and B;,= Bi. 
“Lemma 8:2” of Bliss and Hestenes now holds as before. Similarly “Lemma 
8:3” is true, as is easily seen with the help of the following remarks. Although 
a secondary extremal 7;, ¢; joining the points (x, 7) =(x', 0) and (x, ) =(x?, a) 
is not necessarily an extremal of the field it has associated with it a secondary 
extremal {:—CyViy (Y=1, - - - , g) belonging to the field, where g 
is the order of anormality of g on x'x? and u;,, vi, are g linearly independent 
secondary extremals having u;,=0 on x'x*. Moreover the values of the in- 
tegral “J,” along these two extremals are the same. The remainder of the 
proof is now like that of “Theorem 8:1” of Bliss and Hestenes. 

We now turn to the accessory boundary value problem. Its characteristic 
roots are all real (XIV, p. 774; XIX, p. 394). We have the following theorem: 


THEOREM 8:4. If the extremal g is non-singular and the secondary end
conditions are regular on g, then the second variation J_2(η, w) is positive definite
(positive) along g if and only if the condition III' holds along g and the
characteristic roots of the accessory boundary value problem are all positive (non-negative).


According to the remarks preceding Lemma 3:4 we may suppose that g 
is normal. The theorem then follows from a result given by Hu (XIX, p. 413). 
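
The link between characteristic roots and the sign of the second variation can be seen in a one-variable model. The sketch below assumes the fixed end point problem with second variation equal to the integral of η'^2 − cη^2 over an interval of a given length, for which the accessory equation with parameter σ is η'' + (c + σ)η = 0 with η = 0 at both ends. This model, the trial variation, and the function names are illustrative assumptions and are not taken from the paper.

```python
import math

def characteristic_roots(c, length, count=3):
    """Roots sigma_k = (k*pi/length)**2 - c of eta'' + (c + sigma)*eta = 0
    with eta = 0 at both ends of an interval of the given length (assumed model)."""
    return [(k * math.pi / length) ** 2 - c for k in range(1, count + 1)]

def second_variation(c, length, k=1, steps=2000):
    """Midpoint-rule value of the integral of eta'**2 - c*eta**2 over [0, length]
    for the trial variation eta = sin(k*pi*x/length)."""
    a = k * math.pi / length
    h = length / steps
    total = 0.0
    for m in range(steps):
        x = (m + 0.5) * h
        total += (a * math.cos(a * x)) ** 2 - c * math.sin(a * x) ** 2
    return total * h

short_roots = characteristic_roots(1.0, 2.0)   # interval shorter than pi
long_roots = characteristic_roots(1.0, 4.0)    # interval longer than pi
value_on_short = second_variation(1.0, 2.0)
value_on_long = second_variation(1.0, 4.0)
```

With c = 1 all roots are positive on the interval of length 2 and the trial variation gives a positive value. On the interval of length 4 the smallest root is negative and the same kind of trial variation makes the second variation negative.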

The theorem can also be established with the help of the condition V'.
A method will be outlined briefly as follows. We first replace the integrand 2ω
in the functional J_2(η, w) by 2ω − ση_iη_i and obtain a functional J_2(η, w, σ).
The preceding theorems concerning the functional J_2(η, w) are valid also for
the functional J_2(η, w, σ) when the obvious changes due to the introduction
of the parameter σ are made. By an argument like that given by Morse
(VIII, pp. 533-4) it is found that for σ sufficiently large and negative the
functional J_2(η, w, σ) will be positive definite relative to sets of admissible
variations (η, w) ≠ (0, 0). Let σ_0 be the least upper bound of the values of σ
for which J_2(η, w, σ) is positive definite. It will be shown below that σ_0 must
be finite. We shall now show that σ_0 is a characteristic root. The functional
J_2(η, w, σ_0) must be positive since otherwise there would exist an admissible
arc η_i, w_μ such that J_2(η, w, σ) < 0 for σ = σ_0 and hence for σ < σ_0 and
sufficiently near to σ_0, which is not the case. If the functional J_2(η, w, σ_0) were
positive definite then by Theorem 8:2 the condition V' would hold for this
functional. By the use of Lemma 3:2 applied to sub-intervals of the form
x^1x^3 and x^3x^2 and with the help of well known theorems on quadratic forms it
could then be shown that the condition V' would hold for the functional
J_2(η, w, σ) for values of σ slightly larger than σ_0. The functional J_2(η, w, σ)
would then be positive definite for these values of σ, by Theorem 8:2, and σ_0
could not be the least upper bound for such values of σ. It follows that there
exists at least one admissible arc η_i, w_μ with (η) ≠ (0) on x^1x^2 such that
J_2(η, w, σ_0) = 0. As in the proof of Lemma 3:4 it is seen that this arc has
associated with it a set of multipliers μ_0 = 1, μ_β(x) such that the functions η_i, μ_β
define a secondary extremal for the problem of minimizing the functional
J_2(η, w, σ_0) in the class of admissible variations η_i, w_μ belonging to g. The
functions η_i, μ_β therefore form a characteristic solution and σ_0 a characteristic
root.

In order to show that σ_0 is finite we note that there exists at least one set
of admissible variations η_i, w_μ having (η) ≠ (0) on x^1x^2 since the accessory
minimum problem can be made normal. For this set the functional J_2(η, w, σ)
can be made negative by taking σ sufficiently large and positive.
Consequently σ_0 must be finite. This proves the theorem.

9. Sufficient conditions for relative minima. The end conditions (1:3)
are said to be regular on the admissible arc g under consideration if the matrix
of the derivatives of the functions x^s(α), y_i^s(α) has rank r for (α) = (0). The
arc g is said to satisfy the non-tangency condition if the manifold y_i^s = y_i(x^s)
(s = 1, 2) and the terminal manifold x^s = x^s(α), y_i^s = y_i^s(α) possess no common
tangent line at the point (α) = (0) on the terminal manifold. The end
conditions are regular and the non-tangency condition holds on g if and only if
the secondary end conditions (3:2) are regular on g (VIII, pp. 525-6). No
generality is lost in assuming that the end conditions are regular and the
non-tangency condition holds on g, as can be seen from the proof of Theorem 9:2
below.

The symbol I will be used to denote the necessary condition of Theorem
2:1. An admissible arc g with a set of multipliers $\lambda_0$, $\lambda_\beta(x)$ is said to satisfy the
Weierstrass condition $\mathrm{II}_{\mathfrak{N}}'$ if at each element $(x, y, y', \lambda)$ in a neighborhood $\mathfrak{N}$
of those on g the inequality

$E[x, y, y', \lambda, Y'] > 0$

holds for every admissible set $(x, y, Y') \neq (x, y, y')$. The Clebsch condition
III’ and the conditions IV’, V’, VI’ have been described in §§7 and 8. The
last three conditions can readily be expressed in terms of the extremal family
in a manner analogous to that given by Bliss (IX, pp. 265-6; cf. XVIII, p.
483).


THEOREM 9:1. Let g be an admissible arc for the problem of Bolza with sepa-
rated end conditions. Suppose that the end conditions (1:4) are regular and that
the non-tangency condition holds on g. If g has no corners and satisfies the
conditions I, $\mathrm{II}_{\mathfrak{N}}'$, III’, IV’ with a set of multipliers $\lambda_0 = 1$, $\lambda_\beta(x)$, then g affords a
proper strong relative minimum to the functional J.




From the conditions I and III’ we conclude that g is a non-singular ex- 
tremal since it has no corners (V, p. 735). The theorem will now be estab- 
lished by showing that the hypotheses of Theorem 6:1 are fulfilled. 

As a first step we note that g is a member for values $x^1 \le x \le x^2$, $a_i = a_{i0}$
(i = 1, ..., n) of an n-parameter family of extremals whose equations in the
canonical variables $x$, $y_i$, $z_i = F_{y_i'}$ are of the form


(9:1) $y_i = y_i(x, a_1, \dots, a_n), \qquad z_i = z_i(x, a_1, \dots, a_n).$


The functions $y_i$, $y_{ix}$, $z_i$, $z_{ix}$ have continuous first and second derivatives for
all values (x, a) in a neighborhood of those belonging to g. The parameters (a)
can be chosen so that along g


(9:2) $y_{ia_k}(x, a_0) = u_{ik}(x), \qquad z_{ia_k}(x, a_0) = v_{ik}(x),$


where the functions $u_{ik}$, $v_{ik}$ are secondary extremals having the properties
described in Lemma 7:1. Moreover the family of extremals (6:1) defines a
Mayer field over a neighborhood $\mathfrak{F}$ of g. The proof of the existence of such a
family is like that given by Bliss and Hestenes (XVII, pp. 322-3) and by
Morse (VII, p. 576) with the help of Lemma 7:1. Let $p_i(x, y)$, $\lambda_\beta(x, y)$ be the
slope functions and the multipliers of the field. It is clear that the field $\mathfrak{F}$
can be taken so small that the elements $[x, y, p(x, y), \lambda(x, y)]$ will lie in the
neighborhood $\mathfrak{N}$ specified by the condition $\mathrm{II}_{\mathfrak{N}}'$. The inequality (6:4) then
holds at each point in $\mathfrak{F}$.

The identity (6:2) follows at once from the transversality condition (2:2).
In order to show that the quadratic form (6:3) is positive definite on g it is
convenient to express $I^*$ in terms of the variables $x, a_1, \dots, a_n$ instead of the
variables $x, y_1, \dots, y_n$. In doing so we replace the functions $p_i(x, y)$, $\lambda_\beta(x, y)$
by the functions $y_{ix}(x, a)$, $\lambda_\beta(x, a)$, where $\lambda_\beta(x, a)$ are the multipliers belong-
ing to the family (9:1). We use the following abbreviations:
= 62; = dy; = Yizdu + byi, 
= Yizedx + = yird*u + dyidxu + dyidx + 674;. 
With the help of the Euler-Lagrange equations (2:3) it is found that

$d(F_{y_i'}) = F_{y_i}\,dx + \delta z_i.$

Hence at the initial point 1 on g we have 
— + = — (F — y{Fy,)dx — + 
— + do! = — (F — yiFy,,)d*x — — (Fe — yiFy,)(dx)? 
— dyidx — b26y; + 




If now we replace the first and second differentials of x and y; by their values 
in terms of the differentials of a, it is found with the help of equations (3:5), 
(9:2), and Lemma 7:1 that the inequality 


— = é,,* da,da, 6,62; >0 (u, 1, p) 


holds at the point 1 on g for every set $(da_k, d\alpha_h) \neq (0, 0)$ satisfying the
conditions $\delta y_i(x^1) = c_{ih}^1\, d\alpha_h$. By a similar argument it can be shown that at the final
end point 2 on g the inequality 


— = — b,,2da,da, + > 0 


holds for every set $(da_k, d\alpha_h) \neq (0, 0)$ satisfying the equations $\delta y_i(x^2) = c_{ih}^2\, d\alpha_h$.
The last two inequalities show that the quadratic form (6:3) is positive defi- 
nite on g. Theorem 6:1 now justifies the theorem that was to be proved. 

Theorem 9:1 will now be used in order to obtain sufficient conditions for 
the problem of Bolza with end conditions of the type (1:3). In the following 
theorem it should be noted that the assumptions of regularity of the end con- 
ditions and of non-tangency are not needed. 


THEOREM 9:2. If an admissible arc g without corners satisfies the conditions
I, $\mathrm{II}_{\mathfrak{N}}'$, III’ with a set of multipliers $\lambda_0 = 1$, $\lambda_\beta(x)$ and if the second variation
$J_2(\eta, w)$ is positive definite along g, then g affords a proper strong relative mini-
mum to the functional J.


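In order to prove the theorem we note first that a problem of Bolza with
end conditions of the form (1:3) is equivalent to the problem of finding in
the class of arcs


$y_i = y_i(x), \qquad y_{n+h} = y_{n+h}(x) \qquad (x^1 \le x \le x^2)$


and sets $\alpha_h$, $\gamma_h$ (h = 1, ..., r) satisfying the differential equations and end
conditions


$\phi_\beta = 0, \qquad y_{n+h}' = 0 \qquad (\beta = 1, \dots, m < n;\ h = 1, \dots, r),$
$x^1 = x^1(\alpha), \qquad y_i^1 = y_i^1(\alpha), \qquad y_{n+h}^1 = \alpha_h,$
$x^2 = x^2(\gamma), \qquad y_i^2 = y_i^2(\gamma), \qquad y_{n+h}^2 = \gamma_h$


one which minimizes the functional J. This equivalence follows from the fact
that the functions $y_{n+h}(x)$ are all constants and hence take the values $\alpha_h = \gamma_h$
at $x = x^1$ and $x = x^2$. The new problem is a problem of Bolza with separated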
end conditions and will be called the transformed problem. Let $g_1$ be the ad-
missible arc for the transformed problem which corresponds to the arc g of
the theorem. It is easily seen that the new end conditions are regular and
that the non-tangency condition holds on $g_1$. The arc $g_1$ also satisfies the
conditions I, $\mathrm{II}_{\mathfrak{N}}'$, III’ for the transformed problem with the set of multipliers
$\lambda_0 = 1$, $\lambda_\beta(x)$, $\lambda_{m+h}(x)$, where the multipliers $\lambda_{m+h}(x)$ are constants determined
by the transversality condition (2:2). Moreover there is a one-to-one corre-
spondence between the admissible variations for the two problems and along
corresponding admissible variations the values of the second variation for the
two problems are the same. Theorems 8:1 and 9:1 therefore tell us that $g_1$
furnishes a proper strong relative minimum for the transformed problem and
hence that g furnishes a proper strong relative minimum for the original prob-
lem, as was to be proved.
From Theorems 8:2 and 9:2 we obtain the following result: 


THEOREM 9:3. If an admissible arc g without corners satisfies the conditions
I, $\mathrm{II}_{\mathfrak{N}}'$, III’, V’ with a set of multipliers $\lambda_0 = 1$, $\lambda_\beta(x)$, then g affords a proper
strong relative minimum to the functional J.


Combining Theorems 8:3 and 9:2 we obtain the further 


THEOREM 9:4. The results of Theorem 9:3 remain valid when the condition
V’ is replaced by the condition VI’ provided that the order q of anormality of g
is the same on every sub-interval $x'x''$ of the interval $x^1x^2$ determined by the end
points of g.

The last two theorems are extensions of the sufficient conditions given by 
Bliss (IX, p. 271). The following theorem gives an extension of the sufficient 
conditions given by Morse (VIII, p. 528) and Hu (XIX, p. 417) and is ob- 
tained by combining Theorems 8:4 and 9:2. 


THEOREM 9:5. Suppose that the end conditions are regular and that the non-
tangency condition holds on an admissible arc g having no corners. If g satisfies
the conditions I, $\mathrm{II}_{\mathfrak{N}}'$, III’ with a set of multipliers $\lambda_0 = 1$, $\lambda_\beta(x)$ and if the charac-
teristic roots of the accessory boundary value problem are all positive, then g
affords a proper strong relative minimum to the functional J.


Sufficient conditions for a weak relative minimum can be obtained in the
usual manner by omitting the condition $\mathrm{II}_{\mathfrak{N}}'$ in the above theorems (V, pp.
736-7).

The following example shows clearly that the sufficient conditions here
given actually do not imply the normality relations which we have proposed
to exclude. Let h be a small positive constant and let A(x), B(x) be functions
satisfying the conditions


$A(x) > 0$ on $x^1 \le x < x^1 + h$, $\qquad A(x) = 0$ on $x^1 + h \le x \le x^2$,
$B(x) = 0$ on $x^1 \le x \le x^2 - h$, $\qquad B(x) > 0$ on $x^2 - h < x \le x^2$


and having continuous derivatives of the first three orders. The segment g of
the x-axis between $x^1$ and $x^2$ then furnishes a proper strong minimum to the
integral


r= 


in the class of arcs (1:1) with n=4 satisfying the differential equations 
ye +Al(x)n, ys + yi =0 


and joining the two fixed points $(x, y) = (x^1, 0)$ and $(x, y) = (x^2, 0)$. The order
p of anormality of g is readily found to be unity. The order q of anormality of g
is unity on every sub-interval $x'x''$ satisfying the conditions $x^1 \le x' < x^1 + h$,
$x^2 - h < x'' \le x^2$. If only one of these conditions holds, then q = 2. If neither holds,
then q = 3. It follows that the sufficient conditions given heretofore are not
applicable to g. However g satisfies the sufficient conditions here given with
the set of multipliers $\lambda_0 = 1$, $\lambda_\beta(x) = 0$, except for those in Theorem 9:4.


I. Hadamard, Leçons sur le Calcul des Variations, 1910.

II. Bolza, Über den “anormalen Fall” beim Lagrangeschen und Mayerschen Problem mit gemischten
Bedingungen und variablen Endpunkten, Mathematische Annalen, vol. 74 (1913), pp. 430-446.

III. Bliss, The problem of Mayer with variable end points, these Transactions, vol. 19 (1918), pp. 
305-314. 

IV. Bliss, A boundary value problem of the calculus of variations, Bulletin of the American Mathe- 
matical Society, vol. 32 (1926), pp. 317-331. 

V. Bliss, The problem of Lagrange in the calculus of variations, American Journal of Mathematics,
vol. 52 (1930), pp. 673-744.

VI. Morse and Myers, The problems of Lagrange and Mayer with variable end points, Proceedings
of the American Academy of Arts and Sciences, vol. 66 (1931), pp. 235-253.

VII. Morse, Sufficient conditions in the problem of Lagrange with fixed end points, Annals of 
Mathematics, (2), vol. 32 (1931), pp. 567-577. 

VIII. Morse, Sufficient conditions in the problem of Lagrange with variable end points, American
Journal of Mathematics, vol. 53 (1931), pp. 517-596.

IX. Bliss, The problem of Bolza in the calculus of variations, Annals of Mathematics, (2), vol. 33
(1932), pp. 261-274.

X. Bliss and Schoenberg, On the derivation of necessary conditions for the problem of Bolza, 
Bulletin of the American Mathematical Society, vol. 38 (1932), pp. 858-864. 

XI. Myers, Adjoint systems in the problem of Mayer under general end conditions, Bulletin of the
American Mathematical Society, vol. 38 (1932), pp. 303-312.

XII. Currier, The variable end point of the calculus of variations including a generalization of the
classical Jacobi conditions, these Transactions, vol. 34 (1932), pp. 689-704.

XIII. Graves, On the Weierstrass condition for the problem of Bolza in the calculus of variations, 
Annals of Mathematics, (2), vol. 33 (1932), pp. 747-752. 

XIV. Reid, A boundary value problem associated with the calculus of variations, American Journal 
of Mathematics, vol. 54 (1932), pp. 769-790. 

XV. Carathéodory, Die Theorie der zweiten Variation beim Problem von Lagrange, Sitzungs-
berichte der Bayerischen Akademie der Wissenschaften, 1932, pp. 99-114.

XVI. Carathéodory, Ueber die Einteilung der Variationsprobleme von Lagrange nach Klassen, 
Commentarii Mathematici Helvetici, vol. 5 (1933), pp. 1-10. 




XVII. Bliss and Hestenes, Sufficient conditions for a problem of Mayer in the calculus of variations, 
these Transactions, vol. 35 (1933), pp. 305-326; Contributions to the Calculus of Variations 1931- 
32, The University of Chicago Press, pp. 295-337. 

XVIII. Hestenes, Sufficient conditions for the general problem of Mayer with variable end points, 
these Transactions, vol. 35 (1933), pp. 479-490; Contributions to the Calculus of Variations 1931-32, 
The University of Chicago Press, pp. 339-360. 

XIX. Hu, Problem of Bolza and its accessory boundary value problem, Contributions to the Calcu- 
lus of Variations 1931-32, The University of Chicago Press, pp. 361-443. 

XX. Bower, The problem of Lagrange with finite side conditions, Dissertation, The University of 
Chicago, 1933. 


HARVARD UNIVERSITY, 
CAMBRIDGE, Mass. 


GROUPS IN WHICH THE SQUARES OF THE 
ELEMENTS ARE A DIHEDRAL SUBGROUP* 


BY 
G. A. MILLER 


1. The dihedral subgroup is non-abelian. If a group G has the property
that the squares of its operators constitute a given group then the direct
product of G and any abelian group of order $2^m$ and of type (1, 1, 1, ...)
has the same property. Such direct products will always be excluded in what
follows. It has been noted that a necessary and sufficient condition that there
is at least one group G which has the property that the squares of its opera-
tors constitute a given non-abelian dihedral group H is that the order h of H
is not divisible by 8 and that every odd prime number which divides h is
congruent to unity modulo 4.† To determine the number of these groups when
h is given and satisfies these conditions we let n represent the number of the
different prime numbers which divide h. Since the cyclic subgroup K of index
2 contained in H is the direct product of its Sylow subgroups and all the
groups of isomorphisms of these Sylow subgroups are cyclic it results that
when h is twice an odd number there is one and only one group of order
$h \cdot 2^{n-2}$ which satisfies the condition that the squares of its operators
constitute K.
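
The existence criterion quoted above is easy to test numerically. The following is a minimal sketch in Python, not part of the original paper; the function names `distinct_prime_factors` and `admits_group_of_squares` are hypothetical and chosen only for illustration. It checks that h is not divisible by 8 and that every odd prime dividing h is congruent to 1 modulo 4, and it returns the distinct primes from which the number n of the text is obtained.

```python
def distinct_prime_factors(h):
    """Return the sorted list of distinct primes dividing h (trial division)."""
    primes = []
    p = 2
    while p * p <= h:
        if h % p == 0:
            primes.append(p)
            while h % p == 0:
                h //= p
        p += 1
    if h > 1:
        primes.append(h)
    return primes


def admits_group_of_squares(h):
    """Criterion quoted in the text for a non-abelian dihedral group H of order h:
    h is not divisible by 8 and every odd prime dividing h is congruent to 1 mod 4."""
    if h < 6 or h % 2 != 0:
        raise ValueError("a non-abelian dihedral group has even order h >= 6")
    odd_primes_ok = all(p % 4 == 1 for p in distinct_prime_factors(h) if p != 2)
    return h % 8 != 0 and odd_primes_ok


# n in the text is the number of distinct primes dividing h (the prime 2 included).
n_for_130 = len(distinct_prime_factors(130))  # 130 = 2 * 5 * 13
```

For instance, h = 10 and h = 20 pass the test, while h = 6 fails because 3 is congruent to 3 modulo 4 and h = 40 fails because it is divisible by 8.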


The given group of order $h \cdot 2^{n-2}$ is the largest group which has the prop-
erty that the squares of its operators constitute K when h is twice an odd
number, as will be assumed until the contrary is explicitly stated. It is
merely the direct product of n − 1 dihedral groups, each of order twice the
power of an odd prime. The commutator subgroup of every G is K and the
corresponding quotient group is the abelian group of type (2, 1, 1, ...). This
results directly from the fact that if two operators of G have squares which
are contained in K then the square of their product is also contained therein
since this square could not be an operator of order 2 in H in view of the fact
that this product could not transform this cyclic subgroup according to an
operator of order 4.

It should be emphasized that an operator of G which transforms some of 
the operators of the cyclic subgroup of index 2 in H according to an operator 
of order 2 does not necessarily transform all the operators of this subgroup, 
besides the identity, according to such an operator, but that every operator 

* Presented to the Society, September 7, 1934; received by the editors April 13, 1934. 

† G. A. Miller, Proceedings of the National Academy of Sciences, vol. 20 (1934), p. 129.



of G which transforms an operator of this cyclic subgroup according to an
operator of order 4 transforms each of these operators besides the identity,
according to such an operator. Hence G involves a subgroup of index 2 com-
posed of all of its operators which transform the operators of the given cyclic
subgroup either into themselves or into their inverses. Each of the remaining
operators of G is of order 4 and transforms the operators of odd order in H
according to one of the $2^{n-1}$ ways in which these operators can be transformed
when they are transformed according to an operator of order 4. Since such
an operator and its inverse transform the operators of odd order differently
it results that we have to consider only $2^{n-2}$ of these different possible trans-
formations.

The largest possible order of G is $h \cdot 2^{n-1}$ and there is one and only one G
of this order. It involves as a subgroup of index 2 the given group of order
$h \cdot 2^{n-2}$ composed of all the operators whose squares are the operators of odd
order in H. To determine all the possible G’s it is desirable to note the differ-
ent possible subgroups composed of the operators whose squares constitute
the operators of odd order in H. If such a subgroup is of order $h \cdot 2^{n-2-k}$ the
number of the possible G’s which involve it is $2^k$ since there are $2^k$ sets of
operators of order 4 which can be added to it to obtain a G having the re-
quired properties and each of these sets transforms the operators of odd order
in H in a different way. The number of the possible subgroups of order
$h \cdot 2^{n-2-k}$ has been determined recently* and hence there results the following
theorem:


The number of the groups which involve a given dihedral group whose order
is twice an odd number as the group of the squares of their operators, when the
number of the different prime factors of the order of this dihedral group is n and
each of these odd factors is congruent to unity modulo 4, is equal to the sum of the
indexes of all the subgroups, including the identity and the entire group, of the
abelian group of order $2^{n-2}$ and of type (1, 1, 1, ...) under this group.


It was noted above that the operators of odd order in H constitute the 
commutator subgroup of G. This fact can also be established by noting that 
every such group can be represented as a transitive substitution group whose 
degree is equal to the order of K, since its Sylow subgroup whose order is a 
power of 2 does not involve any invariant subgroup of G besides the identity. 
This follows from the fact that direct products are excluded. Hence it re- 
sults that each of the groups determined above is contained in the holomorph 
of K. The subgroup composed of all the substitutions which omit a fixed let- 
ter of this holomorph is therefore in the group of isomorphisms of K. Since 


* G. A. Miller, Proceedings of the National Academy of Sciences, vol. 20 (1934), p. 203. 




the group of isomorphisms of a cyclic group is abelian it results that the
Sylow subgroups whose orders are a power of 2 in such a G are always abelian
and hence all of these Sylow subgroups are of type (2, 1, 1, ...).
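
As an illustration of the statements above in the smallest case, take h = 10, so that K is cyclic of order 5 and n = 2. The sketch below is our own check, not a construction from the paper: it assumes that the group in question may be taken to be the full holomorph of the cyclic group of order 5 (the maps x → ux + a modulo 5, a group of order 2h = 20), and verifies by brute force that its squares form the dihedral group of order 10 and that its commutators are exactly the five translations constituting K.

```python
from itertools import product

P = 5                      # order of the cyclic group K (assumed example, h = 10)
UNITS = [1, 2, 3, 4]       # automorphisms x -> u*x of the cyclic group of order 5


def mul(g, k):
    """Compose the affine maps g: x -> u*x + a and k: x -> v*x + b (g after k)."""
    (a, u), (b, v) = g, k
    return ((a + u * b) % P, (u * v) % P)


def inv(g):
    """Inverse of the affine map x -> u*x + a."""
    a, u = g
    u_inv = next(v for v in UNITS if (u * v) % P == 1)
    return ((-u_inv * a) % P, u_inv)


HOLOMORPH = list(product(range(P), UNITS))          # 20 elements (a, u)
SQUARES = {mul(g, g) for g in HOLOMORPH}
DIHEDRAL = {(a, u) for a in range(P) for u in (1, 4)}   # maps x -> +-x + a
COMMUTATORS = {mul(mul(g, k), mul(inv(g), inv(k)))
               for g in HOLOMORPH for k in HOLOMORPH}
TRANSLATIONS = {(a, 1) for a in range(P)}            # the cyclic subgroup K
ORDER_FOUR = [g for g in HOLOMORPH if g[1] in (2, 3)]
```

In this example the elements outside the subgroup of index 2 are those with u = 2 or u = 3, and each of them has a square with u = 4, which is an operator of order 2 in the dihedral group; this agrees with the statement that the remaining operators of G are of order 4.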

The number of these Sylow subgroups is h/2 and no two of them have an
operator of order 4 in common. A necessary and sufficient condition that no
two of them have an operator of order 2 in common is that the order of such
a G is 2h. The number of the groups of this order is $2^{n-2}$ and all of them are
conformal. Every other G contains more than one of these conformal groups.
In fact, if the order of such a G is $2^k \cdot h$ it contains $2^{k-1}$ of these groups. The
number of the possible G’s increases very rapidly with the increase of n. In
particular, when n = 5 there are 51 such groups; viz., 8 of order 2h, 28 of
order 4h, 14 of order 8h, and one of order 16h. This number depends only on
the number of the distinct prime numbers which divide h and is independent
of the values of these primes.
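
The count in the theorem can be reproduced mechanically. The sketch below, which is ours and not the author's, assumes the standard fact that the number of subgroups of order $2^j$ in the abelian group of order $2^m$ and type (1, 1, 1, ...) is the Gaussian binomial coefficient with base 2; the function names are hypothetical. Each subgroup of order $2^j$ has index $2^{m-j}$ and, following the text, contributes that many groups G of order $2^{j+1} \cdot h$, where m = n − 2.

```python
def gaussian_binomial_2(m, j):
    """Number of subgroups of order 2**j in the elementary abelian group of order 2**m."""
    num, den = 1, 1
    for i in range(j):
        num *= 2 ** (m - i) - 1
        den *= 2 ** (i + 1) - 1
    return num // den


def count_groups_by_order(n):
    """Map the multiplier c (so that the order of G is c*h) to the number of groups G,
    for a dihedral group whose order h is twice an odd number with n distinct primes."""
    m = n - 2
    if m < 0:
        raise ValueError("n must be at least 2")
    counts = {}
    for j in range(m + 1):
        index = 2 ** (m - j)                  # index of a subgroup of order 2**j
        counts[2 ** (j + 1)] = gaussian_binomial_2(m, j) * index
    return counts


counts_n5 = count_groups_by_order(5)
total_n5 = sum(counts_n5.values())
```

For n = 5 this gives 8 groups of order 2h, 28 of order 4h, 14 of order 8h and one of order 16h, in total 51, in agreement with the figures stated above.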

It remains to determine the possible groups when h is four times an odd
number and each of the odd prime factors of h is again congruent to unity
modulo 4. None of these groups is contained in the holomorph of K, since 
half of the operators of H are negative when it is thus represented. Each of 
the two dihedral subgroups of index 2 contained in H is invariant under G 
since all the operators of G which are either commutative with every operator 
of K or transform some of these operators into their inverses constitute a 
subgroup of index 2 under G. Each of the remaining operators of G is of 
order 4 and has for its square a non-invariant operator of order 2 in H. Since 
these operators cannot transform the two given dihedral subgroups of H 
into each other none of the operators of G can have this property and there- 
fore each of these dihedral subgroups is invariant under G. 

The commutator subgroup of G is again composed of the operators of odd 
order in K. Hence it results that all the operators of G whose squares appear 
in one of the two dihedral subgroups of index 2 in H constitute a subgroup 
of index 2 under G. That is, every such G contains as a subgroup of index 2 one 
of the groups enumerated above which has the property that the squares of its
operators are a dihedral group whose order is twice an odd number. To ex- 
tend one of the given groups so as to obtain the desired result we may repre- 
sent it as a regular substitution group and make it simply isomorphic with 
itself represented on a different set of letters so as to obtain an intransitive 
substitution group. To this we may adjoin a substitution of order 4 which is 
commutative with every substitution of this intransitive group, interchanges 
its two systems of intransitivity and has for its square the invariant substitu- 
tion of order 2 contained in H. Each such G contains two subgroups of index 2 
such that the squares of their operators are the dihedral subgroups of index 2 




in H, and one such subgroup such that the squares of its operators constitute 
the cyclic subgroup of index 2 in H. The cross-cut of these three subgroups 
is the subgroup of index 4 composed of the operators of odd order in H. 

From the preceding paragraph it results that G involves an invariant 
cyclic subgroup of order h and is contained in the holomorph of this cyclic
subgroup. Its operators whose squares are the non-invariant operators of 
order 2 in H transform the operators of this cyclic subgroup into powers which 
are congruent to unity modulo 4 as otherwise these squares of operators of 
order 4 would not give all the non-invariant operators of order 2 in H. It 
therefore results that every such G contains an operator of order 4 which is 
commutative with each of its operators and hence there results the following 
theorem: 


Each group in which the squares of the operators constitute a dihedral group 
whose order is twice an odd number is a subgroup of index 2 under one and only 
one group in which the squares of the operators constitute a dihedral group whose 
order is four times an odd number and the number of distinct groups in both of 
these cases is the same. 


2. The dihedral subgroup is the four group. The special case when the
operators which are the squares of the operators of a given group G constitute
the four group is much more difficult than the more general case when these
squares constitute a non-abelian dihedral group. The only abelian group
which comes under this special case is the group of order 16 and of type (2, 2),
and the order of every other group which comes thereunder is obviously also
of the form $2^m$. The commutator subgroup of every such non-abelian group
is either of order 2 or of order 4. We shall first consider the former case and 
hence the operators of order 2 contained in G together with its operators of 
order 4 whose squares are equal to the commutator of order 2 constitute a 
subgroup of index 2 under G. This subgroup belongs to one of the three known 
infinite categories of groups involving separately two and only two operators 
which are squares. Its central involves at least three invariant operators of 
order 2 and at most seven such operators since G is supposed to have the 
property that it is not a direct product. 
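
The statement about the abelian case can be checked by brute force among the abelian groups of order 16. The sketch below is an illustration of ours, with hypothetical function names; groups are written additively as products of cyclic groups, so that "squaring" an element means doubling it. It confirms that among the five abelian groups of order 16 only the one of type (2, 2), the direct product of two cyclic groups of order 4, has the four group as its group of squares.

```python
from itertools import product


def double(g, moduli):
    """Square of g in multiplicative language, i.e. g + g in additive notation."""
    return tuple((2 * x) % m for x, m in zip(g, moduli))


def squares(moduli):
    """Set of squares of the direct product of cyclic groups with the given orders."""
    elements = product(*(range(m) for m in moduli))
    return {double(g, moduli) for g in elements}


def has_four_group_of_squares(moduli):
    """True when the squares form a group of order 4 all of whose elements have order <= 2."""
    sq = squares(moduli)
    zero = tuple(0 for _ in moduli)
    return len(sq) == 4 and all(double(g, moduli) == zero for g in sq)


# The five abelian groups of order 16, given by the orders of their cyclic factors.
ORDER_16 = [(16,), (8, 2), (4, 4), (4, 2, 2), (2, 2, 2, 2)]
WITH_FOUR_GROUP = [m for m in ORDER_16 if has_four_group_of_squares(m)]
```

The squares of an abelian group always form a subgroup, so a set of four squares of exponent 2 is necessarily the four group; the cyclic factor of order 8 in (8, 2) gives a cyclic group of squares of order 4 instead.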

When the central of this subgroup involves only three operators of order
2 there are two such groups of order $2^m$ when m is odd and exceeds 3. These
are the direct products of the cyclic group of order 4 and of the groups which
involve only two operators which are squares but do not contain an invariant
operator of order 4. There are also two such groups of order $2^m$ when m is
even and exceeds 4. One of these two groups is also the direct product of the
cyclic group of order 4 and a non-abelian group which involves only two
operators which are squares but involves an invariant operator of order 4,
while the other is obtained by extending such a non-abelian group by an
operator which does not transform into itself its invariant operator of order 4.
When the central of the given subgroup of index 2 involves seven operators
of order 2 there are two additional such groups when m is even and there is
one such additional group when m is odd. Hence there results the following
theorem:


There are four groups of order $2^m$, m being even and larger than 4, which
satisfy the condition that each of them has the four group for the group of its
squares and involves a commutator subgroup of order 2. When m is odd and larger
than 5 there are three such groups.


It remains to consider the case when the commutator subgroup of G is
the same as the group of the squares of its operators and we shall first con-
sider the special case when G involves an abelian subgroup of index 2. This
subgroup is of one of the following three types: (1, 1, 1, ...), (2, 1, 1, ...),
(2, 2, ...) and G involves only one abelian subgroup of this index. There is
one and only one G which involves such a subgroup of the first of these three
types. It is of order 32 and involves 12 operators of order 4. When this abelian
subgroup is of the second type the order of G is either 32 or 64. In the former
case there is one such G. This involves 20 operators of order 4. In the latter
case there is also one and only one such G. This contains 40 operators of order
4. It remains to consider the case when the abelian subgroup of index 2 is of
type (2, 2, 1, 1, ...) and hence the order of G is 32, 64, or 128.

In the first case there are two groups in which all of the remaining opera- 
tors are of the same order. These are the generalized dihedral and the general- 
ized dicyclic groups. When only four of the operators of the given abelian sub- 
group of index 2 are transformed into their inverses under G there are also 
two groups of order 32. In one of these each of the remaining operators is of 
order 4 while only half of these operators are of this order in the other. There 
is one additional such group of order 32 in which no one of the operators of 
order 4 in the given abelian subgroup of order 16 is transformed into its in- 
verse under G. This group involves 24 operators of order 4 of which eight have 
a common square. When G is of order 64 it involves invariant operators of 
order 4 and there are two isomorphisms to be considered. One of these gives 
rise to two distinct groups while the other gives rise to only one group. There 
is obviously only one group of order 128 which involves this abelian subgroup 
of index 2 and hence the following theorem has been established:



There is one and only one group which satisfies the conditions that it involves
the abelian group of type (1, 1, 1, ...) as a subgroup of index 2 and that the
squares of its operators as well as its commutator subgroup constitute the four
group. There are two such groups which involve the abelian group of type
(2, 1, 1, ...) as such a subgroup, and there are nine such groups which involve
the abelian group of type (2, 2, 1, 1, ...) as such a subgroup.


The most difficult case remains, viz., the one when G contains no abelian 
subgroup of index 2 and when the commutator subgroup of G coincides with 
the group of its squares. All these possible groups may be divided into three 
categories composed of those whose centrals are of order 4, 8, or 16 respec- 
tively. These centrals are of types (1, 1), (2, 1) and (2, 2) respectively. For 
each of these categories it is possible to construct an infinite system of groups 
such that every operator which does not appear in the central has four con- 
jugates under the group. To do this we may start with any abelian group 
whose order is four times the order of the central and whose squares appear 
in the four group contained therein. The group thus obtained is then extended 
twice successively by two operators which are relatively commutative and 
are commutative only with the operators of the given subgroup which appear 
in the central, and whose product has the same property. The order of the 
group thus obtained is sixteen times the order of its central, and each of its 
own invariant operators has four conjugates under it. This group can be ex- 
tended successively by two operators which have their squares therein and 
are commutative with each other and with each operator of the given group. 
The resulting group can be extended as before. By continuing this process we 
obtain a group whose order is an arbitrary power of 16 times the order of the 
central and all of whose operators which do not appear in this central have 
four conjugates under the group. 

The lowest order of a group G which belongs to the infinite system de- 
scribed in the preceding paragraph is 64. This is also the lowest order of G 
whenever it does not involve an abelian subgroup of index 2. If such a G is 
of order 64 and all of its operators except those which are squares have four 
conjugates under G, then every such operator appears in an abelian subgroup 
of order 16 and G contains exactly five such subgroups. These subgroups have 
the central of G in common but no two of them have any other operator in 
common. There is one such group which involves two abelian subgroups of 
order 16 and of type (1, 1, 1, 1). The other three abelian subgroups of order 16 
contained therein are of type (2, 2). When there is one and only one such sub- 
group in G there is also a subgroup of type (2, 2). Hence there is only one
such G. It also contains exactly 27 operators of order 2. There is one and only
one G in which there is no abelian subgroup of type (1, 1, 1, 1). It contains 
only eleven operators of order 2. This proves the following theorem: 


There are three and only three groups of order 64 which separately satisfy 
the following conditions: the group of their squares and their commutator sub- 
group is the four group and each of the operators which is not in this four group
has four conjugates under the group. 


UNIVERSITY OF ILLINOIS, 
URBANA, ILL.




A PROJECTIVE GENERALIZATION OF METRICALLY 
DEFINED ASSOCIATE SURFACES* 


BY 
M. L. MacQUEEN 


1. INTRODUCTION 


In the metric differential geometry of surfaces in ordinary space, two sur-
faces are said by Bianchi to be associate† if the tangent planes at correspond-
ing points are parallel and if the asymptotic curves on either surface corre-
spond to a conjugate net on the other.

It is the purpose of this paper to develop a projective generalization of the 
relation of associateness of surfaces. Since associate surfaces are parallel in 
the metric sense, it will first be necessary to provide a projectively defined 
substitute for the property of metric parallelism. We shall employ as the basis 
of our study in this paper a projective generalization of euclidean parallelism 
of surfaces which the author has developed in his Chicago doctoral disserta- 
tion. 

In §2, after stating a definition of projective parallelism of surfaces and 
briefly explaining this idea, we introduce a canonical form of our system of 
differential equations employed in the study of projectively parallel surfaces 
in ordinary space. In §3 we formulate a definition of projectively associate 
surfaces and investigate to some extent their properties and relations. A more 
general type of associateness which may be conveniently termed modified 
projective associateness is introduced in §4, and a somewhat different canonical 
form of our system of differential equations is employed in its study. Finally, 
in §5, we consider a rather general completely integrable system of partial 
differential equations, namely, the system for two surfaces in the general 
analytic one-to-one point correspondence in ordinary projective space S3, and 
a group of transformations that leaves this configuration invariant. We then 
reduce this system of equations to a new canonical form, and employ it to 
continue briefly the study of modified projective associateness introduced in 
the preceding section. 


* Presented to the Society, September 7, 1934; received by the editors April 8, 1934. 
† Eisenhart, Differential Geometry, Ginn and Company, 1909, p. 378. Hereinafter cited as Eisen-
hart. See also Bianchi, Lezioni di Geometria Differenziale (3d edition), vol. 2, p. 10.




2. PROJECTIVE PARALLELISM OF SURFACES 


In formulating a projective generalization of metric parallelism of sur- 
faces,* we begin by replacing the metric normal congruence by the projective 
normal congruence, and so consider two surfaces $S_x$, $S_y$, in ordinary projective
space $S_3$, with a common projective normal congruence. The developables of
this congruence intersect both surfaces in the projective lines of curvature, 
which form conjugate nets. We then demand, in analogy to the metric paral- 
lelism of the tangent planes, that the tangent planes at corresponding points 
of the two surfaces intersect on a fixed plane. Two surfaces so related are said 
to be parallel in the projective sense. 

For the basis of our study of projective parallelism we employ one of the 
well known transformations of surfaces, namely, the fundamental transformation. Two surfaces are said to be in the relation of a fundamental
transformation, or transformation F,† in case their points are in a one-to-one
correspondence such that the lines joining corresponding points form a con- 
gruence whose developables intersect both surfaces in conjugate nets, neither 
surface being a focal surface of the congruence. The congruence is called the 
conjugate congruence of the transformation because it is conjugate to both 
nets. The tangent planes at corresponding points of the two surfaces intersect 
in the lines of the harmonic congruence of the transformation, which is har- 
monic to both nets. By choosing the projective normal congruence as the com- 
mon conjugate congruence of the transformation F, we provide, as previously 
stated, a projective substitute for the metric normal congruence. Further- 
more, we assume that the developables of the harmonic congruence are inde- 
terminate, that is, that corresponding tangent planes of the two surfaces in- 
tersect in the lines of a fixed plane. This assumption affords us a projective 
substitute for the metric parallelism of the tangent planes. Our definition of 
projective parallelism may now be stated in the following way: 


Two surfaces $S_x$, $S_y$ are said to be projectively parallel in case they are in the
relation of a fundamental transformation with the projective normal congruence 
as the conjugate congruence and with the developables of the harmonic congruence 
indeterminate. 


In order to represent analytically the definition which we have introduced, 
let us consider two projectively parallel surfaces $S_x$, $S_y$ with the respective
parametric vector equations


$x = x(u, v), \qquad y = y(u, v).$


*M.L. MacQueen, A Projective Generalization of Euclidean Parallelism of Surfaces, University 
of Chicago, December, 1933; unpublished doctoral dissertation. Hereinafter cited as Thesis. 
† L. P. Eisenhart, Transformations of Surfaces, Princeton University Press, 1923, p. 34 et seq.




828 M. L. MacQUEEN [October 


The four coordinates x and the four coordinates y form four pairs of solutions 
of a completely integrable system of differential equations of the form 

(1)
$x_{uu} = px + \alpha x_u + \beta x_v + Ly,$
$x_{uv} = cx + ax_u + bx_v,$
$x_{vv} = qx + \gamma x_u + \delta x_v + Ny,$
$y_u = fx + mx_u + Ay,$
$y_v = gx + nx_v + By \qquad (mnLN \neq 0),$
where the notation here employed is similar to that used by Lane in his recent 
book.* 

Before stating the conditions which characterize this system we remark
that in order to treat $S_x$, $S_y$ in a symmetrical manner we see that x, y satisfy
a system of equations of the form (1), but with the roles of x and y interchanged. The coefficients of such a system will be indicated by dashes and will
be given later. In order that $S_y$ may be non-developable we shall suppose that
$\bar{L}\bar{N} \neq 0$.

System (1) is characterized analytically by the following conditions: 
(a) a+b+A + (log NV), — 3(log r),/2 — 2(log = 0, 
(b) y/r + a + (log r)./2 = 0, 
(c) = nr/m, 
(d) f/m = — [log (mn)"*R/L]., 
(e) m(1 — n)B’? + nr(1 — m)C’? + m,(B’ + m,/(4m)) 
+ nur(C’ + n,/(4n)) = 0 


and by the counterpart of (a), (b), and (d) in the substitution 


bec gngqtbiyB N M rR} 


(2) 


(3) 


The invariants 6’, ©’, R of Green, and the invariant r of Eisenhart, appearing 
in equations (2), are expressed for the projective lines of curvature on S, in 
terms of the coefficients of system (1) by the following formulas: 

4a + 2NB/L — 26 + (log N/L)., 

4b + 2Ly/N — 2a + (log L/N)., 
on LB’?/N 4 ¢’?, 
= N/L. 


* Lane, Projective Differential Geome‘ry of Curves and Surfaces, University of Chicago Press, 
1932, p. 183. Hereinafter cited as Lane. 


(4) 




Conditions (2) (a) imply that the line $x_u x_v$ is the reciprocal of the projective normal of $S_x$ at $P_x$, and conditions (b) and (a) imply that the line $xy$
is the projective normal of $S_x$ at $P_x$; condition (c) implies that the tangent
planes at corresponding points of $S_x$, $S_y$ intersect in the lines of a fixed plane;
conditions (c) and (d) imply that the line $y_u y_v$ is the reciprocal of the projective normal of $S_y$ at $P_y$, and conditions (c), (d), and (e) imply that the line
$xy$ is the projective normal of $S_y$ at $P_y$.

It may be remarked that the choice of the proportionality factors which
leads to our canonical form is precisely that which gives Fubini's normal coordinates.

The integrability conditions for system (1) are given by the following 
equations and those obtainable therefrom by the substitution (3): 
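(5)
$a_u + ab + c = \alpha_v + \beta\gamma,$
$b_u + b^2 + a\beta = \beta_v + b\alpha + \beta\delta + p + nL,$
$c_u + bc + ap = p_v + c\alpha + q\beta + gL,$
$g_u + cn + fB = f_v + cm + gA,$
$an + mB + g = m_v + am,$
$aL = L_v + BL + \beta N, \qquad A_v = B_u.$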
The coefficients of the equations corresponding to (1) when the roles of 
x and y are interchanged are given by 
= A, + mL — A(m,/m + f/m + a) — mBB/n, 
a + f/m + m,/m + A, B = mB/n, 
— m(af/m + Bg/n + (f/m)? — p — (f/m)u), 
A, — A(m,/m + a) — B(f/n + mb/n), 
= By, — B(n,/n + b) — A(g/m + na/m), 
=a+m,/m = B+ g/m+ na/m, 
b+ n,/n = A + f/n + mb/n, 
B,+nN — B(n,/n + g/n + 6) — nyA/m, 
ny/m, B, 
— n(dg/n + yf/m + (g/n)? — q — (g/n)e), 
— A/m, m = 1/m, A = — f/m, 
— B/n, n = 1/n, B = — g/n. 
The developables of the projective normal congruence intersect $S_x$ and $S_y$
in parametric conjugate nets which are the projective lines of curvature
thereon, the foci $P_\eta$, $P_\zeta$ of a projective normal being given by

(7) $\eta = y - mx, \qquad \zeta = y - nx.$

The differential equation of the asymptotic curves on $S_x$ is

(8) $L\,du^2 + N\,dv^2 = 0,$

and the asymptotic curves on $S_y$ are given by the equation

(9) $\bar{L}\,du^2 + \bar{N}\,dv^2 = 0.$

3. PROJECTIVELY ASSOCIATE SURFACES 


The projective generalization of metric parallelism of surfaces summarized 
in the preceding section will now be employed in formulating a definition of 
projectively associate surfaces. In analogy to the metric definition of associate surfaces, two surfaces $S_x$, $S_y$, in ordinary projective space, will be called
projectively associate if they are projectively parallel and if the asymptotic curves 
on either surface correspond to a conjugate net on the other. 

A necessary and sufficient condition that the asymptotic curves on one
of two projectively parallel surfaces $S_x$, $S_y$ correspond to a conjugate net on
the other is

(10) $\bar{L}N + L\bar{N} = 0,$

i.e., the harmonic invariant of the asymptotic curves on the two surfaces
vanishes. With the aid of (2) (c) this condition is seen to be equivalent to

(11) $m + n = 0.$

By means of (7), condition (11) shows that $P_y$ is the harmonic conjugate of $P_x$
with respect to the two focal points of a projective normal. Thus we reach the
following conclusion:


If two surfaces $S_x$, $S_y$ are projectively parallel, a necessary and sufficient
condition that they be projectively associate is that corresponding points on a
projective normal separate harmonically the foci thereon.
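
The harmonic condition can be checked with exact arithmetic. The sketch below assumes that the foci are the points $y - mx$ and $y - nx$, and it represents every point of the line $xy$ by its pair of coefficients with respect to $x$ and $y$. The function names are illustrative and do not come from the paper.

```python
from fractions import Fraction


def det2(p, q):
    """2x2 determinant of two points given by homogeneous coordinates on a line."""
    return p[0] * q[1] - p[1] * q[0]


def cross_ratio(p1, p2, p3, p4):
    """Cross ratio (p1, p2; p3, p4) of four points on a projective line."""
    return Fraction(det2(p1, p3) * det2(p2, p4), det2(p1, p4) * det2(p2, p3))


def normal_cross_ratio(m, n):
    """Cross ratio of x, y and the assumed foci y - m x, y - n x (integer m, n)."""
    x, y = (1, 0), (0, 1)          # coefficients with respect to (x, y)
    eta, zeta = (-m, 1), (-n, 1)   # the two foci
    return cross_ratio(x, y, eta, zeta)
```

With this convention the cross ratio works out to $n/m$, so it equals $-1$ exactly when $m = -n$, which is the harmonic separation stated above.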


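The Laplace-Darboux point invariants $H$, $K$, the Weingarten invariants
$W^{(u)}$, $W^{(v)}$, and the tangential invariants $\mathfrak{H}$, $\mathfrak{K}$ are given for the projective
lines of curvature on $S_x$ in terms of the coefficients of system (1) by the
formulas

(12)
$H = c + ab - a_u, \qquad K = c + ab - b_v,$
$W^{(u)} = 2b_v + a_u - \delta_u - B_u - (\log L)_{uv},$
$W^{(v)} = 2a_u + b_v - \alpha_v - A_v - (\log N)_{uv},$
$\mathfrak{H} = K + W^{(u)} = a_u + \beta\gamma - B_u - (\log L)_{uv} = N(\beta_u + \beta b - \beta A - \beta(\log L)_u)/L,$
$\mathfrak{K} = H + W^{(v)} = b_v + \beta\gamma - A_v - (\log N)_{uv} = L(\gamma_v + \gamma a - \gamma B - \gamma(\log N)_v)/N.$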



The corresponding invariants, indicated by dashes, for the projective lines
of curvature on $S_y$, projectively parallel to $S_x$, are found* to have the following expressions:


H = H — (log m*n),,./2, K = K — (log mn'),,/2, 
(13) W™ = W™ + (log mn'),,/2, W = W) + (log mn) 
= §, K = RK. 
It is evident from (2) (d) that
(14) $(f/m)_v = (g/n)_u.$


By using (14) and the integrability conditions (5) a simple calculation is made
which shows that in case $m = -n$ it follows that $a_u = b_v$, and the projective
lines of curvature on $S_x$ have equal point invariants. Moreover, in this case
equations (13) show that the projective lines of curvature on $S_y$ also have
equal point invariants. We therefore reach the following conclusion:


If two surfaces $S_x$, $S_y$ are projectively associate, the projective lines of curvature on each surface have equal point invariants.


We shall now investigate the conjugate nets on each of two projectively
associate surfaces to which correspond the asymptotic curves on the other.
When use is made of (10), equation (8), which defines the asymptotic curves
on $S_x$, may be written


(15) $\bar{L}\,du^2 - \bar{N}\,dv^2 = 0.$


This is the differential equation of the associate conjugate net of the projective lines of curvature on $S_y$, that is, the conjugate net whose tangents at each
point of the surface $S_y$ separate harmonically the tangents to the projective
lines of curvature.
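
The passage from the asymptotic equation on the first surface to the equation of the associate conjugate net on the second can be spelled out in two lines. This is a sketch which assumes that $L$, $N$ are the asymptotic coefficients of the first surface, that $\bar{L}$, $\bar{N}$ are those of the second, that none of them vanishes, and that the harmonic invariant $\bar{L}N + L\bar{N}$ is zero.

```latex
\begin{align*}
\bar{L}N + L\bar{N} = 0 \;&\Longrightarrow\; \frac{L}{N} = -\frac{\bar{L}}{\bar{N}},\\
L\,du^2 + N\,dv^2 = 0 \;&\Longleftrightarrow\; \frac{L}{N}\,du^2 + dv^2 = 0
  \;\Longleftrightarrow\; \bar{L}\,du^2 - \bar{N}\,dv^2 = 0 .
\end{align*}
```

The two directions $dv/du = \pm(\bar{L}/\bar{N})^{1/2}$ are symmetric about the parametric directions and conjugate with respect to $\bar{L}\,du^2 + \bar{N}\,dv^2$, which is what makes this net the associate conjugate net.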

Similarly, by use of (10), we may write equation (9) in the form


$L\,du^2 - N\,dv^2 = 0,$


which shows that the asymptotic curves on $S_y$ correspond to the associate
conjugate net of the projective lines of curvature on $S_x$. We may therefore
state the following theorem:


If two surfaces $S_x$, $S_y$ are projectively associate, and if the parametric net on
each is the projective lines of curvature, then the asymptotic curves on either surface correspond to the associate conjugate net of the parametric conjugate net on
the other.


* Thesis. 




An interesting property of a conjugate net is isothermal conjugacy, the
condition for which is $W^{(u)} = W^{(v)}$ or $(\log r)_{uv} = 0$. Let the projective lines
of curvature on $S_x$ be an isothermally conjugate net, and let $S_y$ be projectively associate to $S_x$. From (10) or (12) it is then easy to obtain the following
result:


If the projective lines of curvature are isothermally conjugate on one of two
projectively associate surfaces, they are also isothermally conjugate on the other.


In this case the projective lines of curvature on $S_x$ and $S_y$ are called J nets,
since they are isothermally conjugate and have equal point invariants.


4. MODIFIED PROJECTIVE ASSOCIATENESS OF SURFACES 


In this section we shall drop the assumption that the common conjugate 
congruence of the transformation F is the projective normal congruence, and 
shall employ in its place a general conjugate congruence. The configuration 
composed of two surfaces in ordinary space in the relation of a fundamental 
transformation having a general conjugate congruence and with the develop- 
ables of the harmonic congruence indeterminate leads us to a characterization 
of surfaces which are projectively parallel in a modified* sense. We shall use 
this type of parallelism in formulating our definition of modified projectively 
associate surfaces. 

For the analytic basis of our work a somewhat different canonical form of
the basic system of differential equations is employed. If $S_x$, $S_y$ are a pair of
surfaces projectively parallel in the modified sense, then the four coordinates
x and the four coordinates y form four pairs of solutions of a completely integrable system of differential equations* of the form


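(16)
$x_{uu} = L(x + y) + \alpha x_u + \beta x_v,$
$x_{uv} = ax_u + bx_v,$
$x_{vv} = N(x + y) + \gamma x_u + \delta x_v,$
$y_u = mx_u, \qquad y_v = nx_v \qquad (mnLN \neq 0).$


The integrability conditions for this system are found to be

(17)
$a_u + ab = \alpha_v + \beta\gamma, \qquad b_v + ab = \delta_u + \beta\gamma,$
$b_u + b^2 + a\beta = \beta_v + b\alpha + nL + \beta\delta + L,$
$a_v + a^2 + b\gamma = \gamma_u + a\delta + mN + \alpha\gamma + N,$
$L_v = aL - \beta N, \qquad N_u = bN - \gamma L,$
$m_v = a(n - m), \qquad n_u = b(m - n).$
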
* Thesis. 




The coefficients of the equations corresponding to (16), when the roles of 
x and y are interchanged, are indicated by dashes and are given by the follow- 
ing expressions: 
= a+ m,/m, 
= mb/n, 
ny/m, 
We shall assume that $\bar{L}\bar{N} \neq 0$ in order that $S_y$ may be non-developable.
The focal points $P_\eta$, $P_\zeta$ of a line $xy$ joining corresponding points $P_x$, $P_y$
of two surfaces $S_x$, $S_y$ are defined by
(19) $\eta = y - mx, \qquad \zeta = y - nx.$


Several results similar to those in the preceding section will now be given.
Inasmuch as the proofs of these results run parallel to those in §3 they will
be omitted in this section. Precisely as in the preceding section, a necessary
and sufficient condition that the asymptotic curves on either of two modified
projectively parallel surfaces $S_x$, $S_y$ correspond to a conjugate net on the
other is found to be given by the condition $m = -n$. Hence the following result is readily obtained.


If two surfaces $S_x$, $S_y$ are projectively parallel in the modified sense, a necessary and sufficient condition that they be projectively associate in the same sense
is that corresponding points $P_x$, $P_y$ of each line $xy$ separate harmonically the
foci thereon.


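Some of the invariants of the parametric conjugate net $N_x$ are found to
have in our notation the following formulas:
$H = ab - a_u, \qquad K = ab - b_v,$
$W^{(u)} = 2b_v + a_u - \delta_u - (\log L)_{uv},$
$W^{(v)} = 2a_u + b_v - \alpha_v - (\log N)_{uv},$
$\mathfrak{H} = K + W^{(u)} = a_u + \beta\gamma - (\log L)_{uv} = N(\beta_u + b\beta - \beta(\log L)_u)/L,$
$\mathfrak{K} = H + W^{(v)} = b_v + \beta\gamma - (\log N)_{uv} = L(\gamma_v + a\gamma - \gamma(\log N)_v)/N,$
$8\mathfrak{B}' = 6a - 2\delta - 3(\log L)_v + (\log N)_v,$
$8\mathfrak{C}' = 6b - 2\alpha - 3(\log N)_u + (\log L)_u.$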
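By use of (17) and (18), the corresponding invariants for $N_y$, indicated by
dashes, are given by the following expressions:

(21)
$\bar{H} = H - (\log m)_{uv}, \qquad \bar{K} = K - (\log n)_{uv},$
$\bar{W}^{(u)} = W^{(u)} + (\log n)_{uv}, \qquad \bar{W}^{(v)} = W^{(v)} + (\log m)_{uv},$
$\bar{\mathfrak{H}} = \mathfrak{H}, \qquad \bar{\mathfrak{K}} = \mathfrak{K},$
$8\bar{\mathfrak{B}}' = 8\mathfrak{B}' + (\log m^3/n)_v, \qquad 8\bar{\mathfrak{C}}' = 8\mathfrak{C}' + (\log n^3/m)_u.$


The following result can be easily established.
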


If two surfaces $S_x$, $S_y$ are projectively associate in the modified sense, the
parametric conjugate net on each surface has equal point invariants.
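
One way to see this is sketched below. It assumes that the integrability conditions of the system include $m_v = a(n - m)$ and $n_u = b(m - n)$, that the point invariants are $H = ab - a_u$, $K = ab - b_v$, that the dashed invariants satisfy $\bar{H} = H - (\log m)_{uv}$, $\bar{K} = K - (\log n)_{uv}$, and that associateness means $n = -m$.

```latex
\begin{align*}
m_v = a(n - m) = -2am \;&\Longrightarrow\; (\log m)_v = -2a,\\
n_u = b(m - n) = -2bn \;&\Longrightarrow\; (\log n)_u = -2b,\\
n = -m \;&\Longrightarrow\; (\log m)_{uv} = (\log n)_{uv}
   \;\Longrightarrow\; a_u = b_v \;\Longrightarrow\; H = K,\\
\bar{H} - \bar{K} \;&=\; (H - K) - (\log m)_{uv} + (\log n)_{uv} \;=\; 0 .
\end{align*}
```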


The asymptotic curves on $S_x$ and $S_y$ are determined by the same equations
as in the preceding section. Precisely as in §3 we arrive at the following result:


If two surfaces $S_x$, $S_y$ are projectively associate in the modified sense, and
if the parametric net on each surface is conjugate, then the asymptotic curves on
either surface correspond to the associate conjugate net of the parametric net on
the other.


The Laplace transformed points, or ray points, of the point P, with re- 
spect to NV, are given by 


= — ax, = ty — 
and the ray points of P, are defined by 
= n(x, — an/m), y-1 = m(x_, — bg/n). 
The ray points of the points P,, P;, defined by (19), are found to be 


mun, = (n — m)(Hn + m,%1), (n — m)n-1 = m.f, 
(m — = nyn, = (m— n)(KE + np x1), 


where H, K are the point invariants of the net V,. The points x, v1, 7, m 
are collinear, as are also the points x_;, y_1, ¢, ¢1. The cross ratio* of the 
four points x1, yi, ¢, <1 is —bn,/(nK), and that of the points x, yi, 7, m is 
—am,,/(mH). Hence we have the following theorem: 


If two surfaces $S_x$, $S_y$ are projectively associate in the modified sense, the
cross ratio of the four points $x_1$, $y_1$, $\eta$, $\eta_1$ is equal to that of the points $x_{-1}$, $y_{-1}$, $\zeta$,
$\zeta_1$, and the common value may be written $2ab/H$.


5. A CANONICAL FORM WITH PARTICULAR PARAMETRIC CURVES 


In this section we place as fundamental the well known system of differential equations used in the study of the configuration composed of two surfaces $S_x$, $S_y$ in ordinary projective space $S_3$, with their points in a one-to-one


* Lane, op. cit., p. 214, exercise 11. 




correspondence. We then reduce this system of equations to a canonical form
so that every pair of integral surfaces is projectively associate in the modified
sense, the surface $S_x$ being referred to its asymptotic net as parametric, and
the curves on $S_y$, corresponding to the asymptotic curves on $S_x$, forming the
parametric conjugate net.


Let
$x = x(u, v), \qquad y = y(u, v)$


be the parametric vector equations of two surfaces $S_x$, $S_y$ in ordinary projective space. If these surfaces have their points in a one-to-one correspondence,
such that corresponding points $P_x$, $P_y$ have the same curvilinear coordinates
u, v, and such that each point $P_y$ does not lie in the tangent plane of $S_x$ at the
corresponding point $P_x$, then $S_x$, $S_y$ are a pair of integral surfaces of a system
of differential equations* of the form

(22)
$x_{uu} = px + \alpha x_u + \beta x_v + Ly,$
$x_{uv} = cx + ax_u + bx_v + My,$
$x_{vv} = qx + \gamma x_u + \delta x_v + Ny,$
$y_u = fx + mx_u + sx_v + Ay,$
$y_v = gx + tx_u + nx_v + By.$

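The integrability conditions for this system are given by the following equations:

(23)
$a_u + ab + c + mM = \alpha_v + \beta\gamma + tL,$
$b_u + b^2 + a\beta + sM = \beta_v + b\alpha + \beta\delta + p + nL,$
$c_u + bc + ap + fM = p_v + c\alpha + q\beta + gL,$
$M_u + aL + (b + A)M = L_v + BL + \alpha M + \beta N,$
$t_u + t\alpha + an + mB + g = m_v + am + s\gamma + tA,$
$g_u + cn + pt + fB = f_v + cm + qs + gA,$
$B_u + tL + nM = A_v + sN + mM$

and those obtainable therefrom by the substitution (3).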


The lines $xy$ joining pairs of corresponding points $P_x$, $P_y$ of the surfaces
$S_x$, $S_y$ form a congruence, the developables† of which are given by


(24) $s\,du^2 - (m - n)\,du\,dv - t\,dv^2 = 0.$

The focal points of a line $xy$ are the points $\eta$, $\zeta$ given by

$\eta = y + k_1x, \qquad \zeta = y + k_2x,$

where $k_1$, $k_2$ are the roots of the equation

(25) $k^2 + (m + n)k + mn - st = 0.$


* Lane, op. cit., p. 183 et seq.
† Ibid., p. 181.


It is known that the asymptotic curves on $S_x$ are parametric in case
$L = N = 0$. Let us suppose from now on that this condition is satisfied, and in
order that the developables of the congruence of lines $xy$ be determinate and
intersect $S_x$ in a conjugate net we shall suppose $m = n$, $st \neq 0$.
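
The relation between the quadratic for the focal parameters, $k^2 + (m + n)k + mn - st = 0$, and the equation of the developables, $s\,du^2 - (m - n)\,du\,dv - t\,dv^2 = 0$, can be checked numerically. The sketch below assumes that the derivatives of $y$ have the form $y_u = \cdots + mx_u + sx_v$, $y_v = \cdots + tx_u + nx_v$, so that requiring the differential of $y + kx$ to contain no $x_u$ term gives $(m + k)\,du + t\,dv = 0$. The function names are illustrative only.

```python
import cmath


def focal_parameters(m, n, s, t):
    """Roots k1, k2 of k^2 + (m + n) k + (m n - s t) = 0."""
    root = cmath.sqrt((m + n) ** 2 - 4 * (m * n - s * t))
    return (-(m + n) + root) / 2, (-(m + n) - root) / 2


def developable_residual(m, n, s, t, k):
    """Left member of s du^2 - (m - n) du dv - t dv^2 along the direction
    paired with the focus y + k x, taken as du : dv = t : -(m + k)."""
    du, dv = t, -(m + k)
    return s * du ** 2 - (m - n) * du * dv - t * dv ** 2
```

For every root $k$ the residual vanishes, and when $m = n = 0$ the roots reduce to $\pm(st)^{1/2}$.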

It is not difficult to calculate the system of equations corresponding to 
(22) when the roles of x and y are interchanged. We shall compute all of the 
coefficients of such a system later, but at the moment the only coefficients 
that are needed are those corresponding to L, M, and N which are indicated 
by dashes and given* by the following formulas: 


= futnpt+cst+Acu + Bers, 
= fotontqs+Aca + Bes 
gut cnt ptt+ Bea + Acasa, 

= go, + Aces, 

wherein the coefficients c;; are defined by placing 
Cu = +f + na + as, Ci2 = Su + + bs, 
= my +g + nb + Coe = ty + ny + at, 
C31 = 2, + an + Sy, = Sot f+ bn sé, 
= Ny + On + =t, +g+an-+ fa, 


(26) 


and where 


AA = sg — nf, AB = tf — ng, A=n*?— st #0. 


The parametric curves on $S_y$ form a conjugate net $N_y$ in case $\bar{M} = 0$. We
shall suppose from now on that this condition is satisfied, and in order that
$S_y$ may be non-developable we shall suppose $\bar{L}\bar{N} \neq 0$. The developables of the
congruence of lines $xy$ intersect $S_y$ in a conjugate net in case


(27) $t\bar{L} - s\bar{N} = 0,$


a condition which we shall suppose from now on to be satisfied.
It is possible to simplify system (22) still more by a transformation of
the form

(28) $x = \lambda\bar{x}, \qquad y = \mu\bar{y}.$


* Lane, op. cit., p. 185.



The effect of this transformation on the coefficients f, g, A, B is found to be 
given by the formulas 


uf = Af + sr./d), uz = +A,/d), 


(29) 
A =A — B= B— 


The last of (23) shows that $\mu$ can be chosen so that $A = B = 0$. We shall suppose from now on that this choice has been made. A condition necessary and
sufficient that $\lambda$ can be chosen so that $f = g = 0$ is


(f/s)u (g/t).. 


By means of (23) and (26) this condition can be shown to be equivalent to
(27). We shall suppose from now on that this choice of $\lambda$ has been made.

When $f = g = A = B = 0$, the line $h$ of intersection of the tangent planes at
two corresponding points $P_x$, $P_y$ of the surfaces $S_x$, $S_y$ joins the points $P_\rho$, $P_\sigma$
defined by

(30) $\rho = y_u/s = x_v + nx_u/s, \qquad \sigma = y_v/t = x_u + nx_v/t,$

as is seen on inspecting the last two of equations (22).

When $u$, $v$ vary, the line $h$ generates a congruence $\rho\sigma$, whose developables
will now be determined. If, as the point $P_x$ describes a curve of the family
$dv - \lambda\,du = 0$ on the surface $S_x$, the line $h$ generates a developable of the congruence $\rho\sigma$, and if the point $P_\zeta$ defined by


$\zeta = \rho + k\sigma$ ($k$ scalar)


is the corresponding focal point of the line $h$, then $h$ is tangent to the locus
of the point $P_\zeta$; consequently the derivative $\zeta'$ may be expressed as a linear
combination of $\rho$, $\sigma$ only. But by actual calculation it is found that $\zeta'$ appears as a linear combination of $x$, $\rho$, $\sigma$, $y$. Setting equal to zero the coefficients of $x$, $y$ therein, we obtain conditions on the functions $k$, $\lambda$ necessary
and sufficient that the line $h$ may generate a developable of the congruence $\rho\sigma$
and have $P_\zeta$ for focal point, namely,

(31)
$(cst + npt) + (cns + pst)k + (qst + nct)\lambda + (cst + nqs)k\lambda = 0,$
$st + nsk + nt\lambda + stk\lambda = 0.$


Elimination of $k$ and substitution of $dv/du$ for $\lambda$ give the differential equation
of the developables of the congruence $\rho\sigma$, namely,


(32) $p\,du^2 - q\,dv^2 = 0.$



A necessary and sufficient condition that the developables of the congruence $\rho\sigma$ be indeterminate is seen from (32) to be


(33) p=q=0. 

We shall suppose from now on that this condition is satisfied. As a result of 
the conditions which we have thus far imposed we find from (26) that 

(34) L=c, M=an=0, N=d. 


In view of the previous assumptions it is therefore evident from (34) that
$m = n = 0$, $c \neq 0$.

The most general transformation of the form (28) which leaves the form
of system (22) invariant, has $\lambda$ and $\mu$ constant. The only coefficients not absolutely invariant under such a transformation are $s$, $t$, $M$, for which we find


(35) $\bar{s} = \lambda s/\mu, \qquad \bar{t} = \lambda t/\mu, \qquad \bar{M} = \mu M/\lambda.$


It is not difficult to show by means of the integrability conditions (23)
that


$M_u/M = c_u/c, \qquad M_v/M = c_v/c,$

and from these results it is evident that

(36) $M = kc$ ($k$ = const.).

It is therefore possible by a suitable choice of $\lambda$ and $\mu$ to make the constant
appearing in (36) equal to unity. We thus reach the following conclusion.


Any system (22) such that every pair of integral surfaces is projectively parallel in the modified sense can be reduced to the form

(37)
$x_{uu} = \alpha x_u + \beta x_v,$
$x_{uv} = M(x + y) + ax_u + bx_v,$
$x_{vv} = \gamma x_u + \delta x_v,$
$y_u = sx_v, \qquad y_v = tx_u \qquad (stM \neq 0).$


The parametric net $N_x$ is the asymptotic net and the parametric net $N_y$ is conjugate.


The integrability conditions for system (37) are found to be

(38)
$a_u + ab + c = \alpha_v + \beta\gamma,$
$b_u + b^2 + a\beta + sM = \beta_v + b\alpha + \beta\delta,$
$M_u + bM = \alpha M,$
$t_u + t\alpha = s\gamma,$

and the formulas obtainable from these by the substitution (3).




The system of equations corresponding to (37) when the roles of x and y
are interchanged is found to be

(39)
$y_{uu} = \bar{L}(x + y) + \bar{\alpha}y_u + \bar{\beta}y_v,$
$y_{uv} = \bar{a}y_u + \bar{b}y_v,$
$y_{vv} = \bar{N}(x + y) + \bar{\gamma}y_u + \bar{\delta}y_v,$

where

(40)
$\bar{L} = cs, \qquad \bar{\alpha} = b + s_u/s, \qquad \bar{\beta} = as/t,$
$\bar{a} = \beta t/s, \qquad \bar{b} = \gamma s/t,$
$\bar{N} = ct, \qquad \bar{\gamma} = bt/s, \qquad \bar{\delta} = a + t_v/t,$
$\bar{s} = 1/t, \qquad \bar{t} = 1/s.$


It is evident that the asymptotic curves which are parametric on $S_x$ correspond to the curves of the parametric conjugate net on $S_y$. The asymptotic
curves on $S_y$ are given by


(41) $\bar{L}\,du^2 + \bar{N}\,dv^2 = 0.$

By use of (40) this equation becomes

$s\,du^2 + t\,dv^2 = 0$


which defines the associate conjugate net of the net in which the developables of the congruence of lines $xy$ intersect $S_x$. Thus since the asymptotic
curves on each of the two surfaces $S_x$, $S_y$ correspond to a conjugate net on the
other, we therefore reach the following conclusion:


Any system (22) such that every pair of integral surfaces is projectively associate in the modified sense, can be reduced to the form (37).


The last two equations of (37) show that the tangent to an asymptotic
u-curve (v-curve) through $P_x$ on $S_x$ intersects the tangent to the v-curve
(u-curve) of the parametric conjugate net through $P_y$ on $S_y$ in a point which
lies in a fixed plane. Since statements similar to the preceding can be made
by choosing a conjugate net as parametric on $S_x$ and the asymptotic net as
parametric on $S_y$, we therefore have a projective generalization of the theorem* of Eisenhart:


* Eisenhart, op. cit., p. 381.



The tangents to the asymptotic curves on one of two projectively associate sur- 
faces meet the tangents to the curves conjugate to the corresponding curves on the 
other surface in points of a fixed plane. 


Inspection of (25) now shows that the two focal points $P_\eta$, $P_\zeta$ of a line $xy$
are given by
(42) $\eta = y + (st)^{1/2}x, \qquad \zeta = y - (st)^{1/2}x.$


The cross ratio of the points $P_x$, $P_y$ and the two focal points of the generator
of the conjugate congruence is given by


$(x, y, \eta, \zeta) = (\infty, 0, (st)^{1/2}, -(st)^{1/2}) = -1.$

Corresponding points $P_x$, $P_y$ of two projectively associate surfaces $S_x$, $S_y$ are
separated harmonically by the focal points of the line joining them.


If local coordinates $x_1, \cdots, x_4$ based on the tetrahedron $x$, $x_u$, $x_v$, $y$ with
suitably chosen unit point are introduced, the first and second focal planes
of a line $xy$ are found to have the local equations


$x_2 - (t/s)^{1/2}x_3 = 0, \qquad x_2 + (t/s)^{1/2}x_3 = 0.$


Therefore the planes $x_2 = 0$, $x_3 = 0$ containing a line $xy$ and the asymptotic tangents through $P_x$ on $S_x$ separate the first and second focal planes of the line $xy$
harmonically.
The developables of the congruence of lines $xy$ intersect $S_x$ in a conjugate
net whose differential equation may be written


$s\,du^2 - t\,dv^2 = 0.$

Similarly, the developables intersect $S_y$ in the conjugate net given by

$\bar{s}\,du^2 - \bar{t}\,dv^2 = 0.$

When reference is made to (40) it is evident that these curves on $S_x$ and $S_y$
correspond.


SOUTHWESTERN COLLEGE, 
MEMPHIS, TENN.


ON THE POWER SERIES FOR 
ELLIPTIC FUNCTIONS* 


BY 
E. T. BELL 


1. Introduction. A more direct and more practicable method than those 
hitherto used for obtaining the coefficients in the power series expansions of 
doubly periodic and other elliptic theta quotients appears incidentally in 
some work relating to representations of rational integers as sums of integer 
squares. References to the literature will be found in the paper of Gruder,†
and in the treatises of Enneper‡ and Krause.§ Most of the complications in
some other methods, for example that of D. André,|| enter with the use of the
differential equation (or the equivalent difference equation, obtained by 
equating coefficients) satisfied by the function to be expanded. By avoiding 
the use of the differential equation entirely, the arithmetical nature of the 
coefficients in the power series becomes evident, and much tedious algebra 
is obviated. In the method used here the difference equations for the coeffi- 
cients are linear; other methods introduce non-linear equations. 

Hermite¶ proposed an extremely ingenious and elegant method, based on
the transformation of the second order, for obtaining the coefficients when 
the functions are doubly periodic. Later** he remarked that this method is 
incapable, apparently, of leading to the desired end. However, Gruder (loc. 
cit., pp. 158-166) succeeded in obtaining explicit formulas for certain coef- 
ficients by this method, although the arithmetical character of the coeffi- 
cients is perhaps not as evident as it might be. 

Hermite** published only specimens of the results furnished by another 
method, without indicating what this method was. As noted by Picard,††
some of Hermite’s explicit formulas thus obtained are incorrect (owing to 


* Presented to the Society, September 7, 1934; received by the editors April 30, 1934. 

† O. Gruder, Wiener Sitzungsberichte, II a, vol. 126 (1917), pp. 125-183.

‡ A. Enneper, Elliptische Functionen, 1890, §47.

§ M. Krause, Theorie der doppeltperiodischen Functionen, 1895, §43.

|| D. André, Annales de l'École Normale Supérieure, (2), vol. 6 (1877), pp. 265-328. Ibid., vol.
8 (1879), pp. 151-168; vol. 9 (1880), pp. 107-118.

¶ Ch. Hermite, Comptes Rendus (Paris), vol. 57 (1863), pp. 613-618; Liouville's Journal, (2),
vol. 9 (1864), pp. 289-295.

** Ch. Hermite, Lettre à M. Königsberger, Crelle's Journal, vol. 81 (1876), pp. 220-228. Oeuvres,
vol. 3, pp. 236-245. See also Oeuvres, vol. 3, pp. 222-231.

†† Hermite, Oeuvres, vol. 3, p. 237.




842 E. T. BELL [October 


slips in calculation). The incorrect formulas have been reproduced in the 
treatises by Enneper and Krause cited above; they may be easily corrected 
by the present method. The expansion of cn x being one of those containing 
errors, we shall consider it first in detail as an illustration of the general 
method, which consists of comparing the MacLaurin and Fourier expansions of 
the function whose power series expansion is sought. If the origin is a singularity 
of the function, the singularity is removed by any of the familiar devices 
used in obtaining the Fourier expansion; the procedure will be clear from the 
examples in §§4, 5. All of the series in the sequel are absolutely convergent 
for values of the variables different from zero. 

2. Expansion of cn x. It is readily seen that the expansion is of the form

(1) $\operatorname{cn} x = 1 + \sum_{s=1}^{\infty} (-1)^s Q_s(k^2)\,\frac{x^{2s}}{(2s)!},$

(2) $Q_s(k^2) = \sum_{r=0}^{s-1} q_r(s)\,k^{2r},$

where the $q_r(s)$ are integers. The problem of expanding cn x is thus reduced
to that of calculating $q_r(s)$ as a function of r, s.

Replacing cn x by its equivalent theta quotient, and expanding the latter 
in a cosine series,* 


2 = 
= 1| 7) cos tx], 


m=1, 3,5, - ;m=tr(t, r integers >0), (—1|r) In the last we 
now expand the cosines, rearrange the result (as is obviously permissible) as 
a power series in x, apply (1) to the left of (3), and finally equate coefficients 
of x**. Thus 


(4) = 


where $\xi_{2s}(m)$ denotes the sum of the (2s)th powers of all those (positive) divisors of
m whose conjugate divisors are of the form 4h+1 minus the like sum in which the
conjugates are of the form 4h+3. In (4) we apply (2), replace $q$ by $q^4$ in the result, and get


(3) 82 cn (xd?) = 


a—1 


(5) = 4 
r=0 


* This series, with others of a similar kind in later sections, is given with many more in my paper, 
Messenger of Mathematics, vol. 54 (1924), pp. 116-176. The recurrence (6) occurs incidentally in 
my paper on sums of squares, Bulletin of the American Mathematical Society, vol. 26 (1919), pp.
19-25. 


1934] POWER SERIES FOR ELLIPTIC FUNCTIONS 843 


Let $N(n, f, g)$ denote the number of those representations of n as a sum of f
squares, precisely g of which are odd with roots greater than zero, and occupy
the first g places in the representations, and f − g are even with roots greater than,
equal to, or less than zero. Then, from (5) and the definition of $N$, we have the
following recurrence for the $q_r(s)$:

(6) $\sum_{r=0}^{(m-1)/2} N(2m, 4s+2, 4r+2)\,2^{4r}\,q_r(s) = \xi_{2s}(m).$



To calculate $q_j(s)$ we take $m = 1, 3, 5, \cdots, 2j+1$ in (6) and solve the resulting linear equations for $q_j(s)$. An explicit determinant formula for the general coefficient $q_j(s)$ is thus obtained, but it is more practical to proceed step
by step, evaluating the numbers $N$ as they occur. To illustrate the process,
we shall calculate $q_0(s)$, $q_1(s)$, $q_2(s)$, $q_3(s)$, the first three of which were given
correctly, and the fourth incorrectly, by Hermite.
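
The step-by-step process can be mirrored in a few lines of code. The sketch below is an illustration under stated assumptions, not the author's procedure: it takes the recurrence in the form $\sum_{r} N(2m, 4s+2, 4r+2)\,2^{4r} q_r(s) = \xi_{2s}(m)$ for odd $m$, counts $N$ by direct enumeration instead of by closed formulas, and uses hypothetical helper names (`_count`, `N`, `xi`, `q_coefficients`).

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def _count(n, odd, even):
    """Ordered ways to write n as `odd` odd squares with positive roots,
    followed by `even` even squares whose roots may be positive, zero or negative."""
    if odd == 0 and even == 0:
        return 1 if n == 0 else 0
    total = 0
    if odd > 0:
        a = 1
        while a * a <= n:
            total += _count(n - a * a, odd - 1, even)
            a += 2
        return total
    total = _count(n, 0, even - 1)                    # root 0
    a = 2
    while a * a <= n:
        total += 2 * _count(n - a * a, 0, even - 1)   # roots +a and -a
        a += 2
    return total


def N(n, f, g):
    """Number of representations of n by f squares, the first g odd, the rest even."""
    return _count(n, g, f - g)


def xi(m, s):
    """Signed sum of d^(2s) over divisors d of odd m: + if m/d = 4h+1, - if m/d = 4h+3."""
    return sum((1 if (m // d) % 4 == 1 else -1) * d ** (2 * s)
               for d in range(1, m + 1) if m % d == 0)


def q_coefficients(s):
    """Solve the recurrence successively for q_0(s), ..., q_{s-1}(s)."""
    q = []
    for j in range(s):
        m = 2 * j + 1
        known = sum(N(2 * m, 4 * s + 2, 4 * r + 2) * 16 ** r * q[r] for r in range(j))
        lead = N(2 * m, 4 * s + 2, 4 * j + 2) * 16 ** j
        value, remainder = divmod(xi(m, s) - known, lead)
        assert remainder == 0, "each q_r(s) is expected to be an integer"
        q.append(value)
    return q
```

For $s = 2, 3, 4$ this reproduces the coefficients of the familiar expansion $\operatorname{cn} x = 1 - x^2/2! + (1 + 4k^2)x^4/4! - (1 + 44k^2 + 16k^4)x^6/6! + (1 + 408k^2 + 912k^4 + 64k^6)x^8/8! - \cdots$.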


m = 1: $N(2, 4s+2, 2)\,q_0(s) = \xi_{2s}(1).$


Referring to the definition of $N$, we see that $2 = 1^2 + 1^2 + 4s\cdot 0^2$ is the only
representation enumerated by $N(2, 4s+2, 2)$. By the definition of $\xi$, $\xi_{2s}(1) = 1$.
Hence $q_0(s) = 1$.


m = 3: $N(6, 4s+2, 2)\,q_0(s) + N(6, 4s+2, 6)\,2^4 q_1(s) = \xi_{2s}(3);$

$6 = 1^2 + 1^2 + [2^2 + (4s-1)0^2],$

$N(6, 4s+2, 2) = \frac{2(4s)!}{1!\,(4s-1)!} = 8s; \qquad \xi_{2s}(3) = 3^{2s} - 1;$

$2^4 q_1(s) = 3^{2s} - 8s - 1.$


m = 5:

$10 = 1^2 + 1^2 + [2^2 + 2^2 + (4s-2)0^2] = 1^2 + 3^2 + [4s(0^2)],$

$N(10, 4s+2, 2) = \frac{2^2(4s)!}{2!\,(4s-2)!} + 2 = 2(16s^2 - 4s + 1);$

$10 = 6\cdot 1^2 + [2^2 + (4s-5)0^2],$

$N(10, 4s+2, 6) = \frac{2(4s-4)!}{1!\,(4s-5)!} = 8(s-1);$

$N(10, 4s+2, 10) = 1; \qquad \xi_{2s}(5) = 5^{2s} + 1;$

$(32s^2 - 8s + 2)\,q_0(s) + 8(s-1)\,2^4 q_1(s) + 2^8 q_2(s) = 5^{2s} + 1;$

$2^8 q_2(s) = 5^{2s} - 8(s-1)3^{2s} + 32s^2 - 48s - 9.$



m = 7:

$14 = 2\cdot 1^2 + [3\cdot 2^2 + (4s-3)0^2] = 1^2 + 3^2 + [2^2 + (4s-1)0^2],$

$N(14, 4s+2, 2) = \frac{2^3(4s)!}{3!\,(4s-3)!} + 2\cdot\frac{2(4s)!}{1!\,(4s-1)!} = \frac{16s}{3}(16s^2 - 12s + 5);$

$14 = 6\cdot 1^2 + [2\cdot 2^2 + (4s-6)0^2] = 3^2 + 5\cdot 1^2,$

$N(14, 4s+2, 6) = \frac{2^2(4s-4)!}{2!\,(4s-6)!} + 6 = 2(16s^2 - 36s + 23);$

$14 = 10\cdot 1^2 + [2^2 + (4s-9)0^2],$

$N(14, 4s+2, 10) = \frac{2(4s-8)!}{1!\,(4s-9)!} = 8(s-2);$

$N(14, 4s+2, 14) = 1; \qquad \xi_{2s}(7) = 7^{2s} - 1;$

$2^{12} q_3(s) = 7^{2s} - 8(s-2)5^{2s} + 2(16s^2 - 60s + 41)3^{2s} - \tfrac{1}{3}(256s^3 - 1248s^2 + 1280s + 297).$
For $s = 1, 2, 3, \ldots$ these values check with the numerical results given 
in the treatises. By the transformation of the first order, 


sn (ku, 1/k) =k sn (x, k); 


whence q;(s) =q._;(s). 
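
The step-by-step use of recurrence (6) can be sketched in code. The fragment below is only an illustrative sketch under stated assumptions, not the computation of the paper: it counts $N(n, f, g)$ by multiplying truncated power series (a device chosen here for brevity), and it assumes $\xi_{2s}(m) = \sum_{d\delta = m} (-1)^{(\delta-1)/2} d^{2s}$, a form inferred from the values $\xi_{2s}(3) = 3^{2s}-1$, $\xi_{2s}(5) = 5^{2s}+1$, $\xi_{2s}(7) = 7^{2s}-1$ used above. All function names are hypothetical.

```python
from fractions import Fraction


def poly_mul(a, b, n):
    """Multiply two coefficient lists, discarding powers above n."""
    out = [0] * (n + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j > n:
                break
            out[i + j] += ai * bj
    return out


def count_N(n, f, g):
    """N(n, f, g): g odd squares with positive roots in the first g places,
    then f - g even squares whose roots may be positive, zero or negative."""
    if g > f:
        return 0
    odd = [0] * (n + 1)
    even = [0] * (n + 1)
    even[0] = 1                      # the root 0
    a = 1
    while a * a <= n:
        odd[a * a] = 1               # roots 1, 3, 5, ...
        a += 2
    b = 2
    while b * b <= n:
        even[b * b] = 2              # roots +b and -b
        b += 2
    series = [1] + [0] * n
    for _ in range(g):
        series = poly_mul(series, odd, n)
    for _ in range(f - g):
        series = poly_mul(series, even, n)
    return series[n]


def xi(m, s):
    """Assumed form of xi_{2s}(m) for odd m (see the lead-in)."""
    total = 0
    for d in range(1, m + 1):
        if m % d == 0:
            delta = m // d
            total += (-1) ** ((delta - 1) // 2) * d ** (2 * s)
    return total


def q_coeffs(s, j_max):
    """Solve recurrence (6) for q_0(s), ..., q_{j_max}(s); requires j_max <= s."""
    f = 4 * s + 2
    q = []
    for j in range(j_max + 1):
        m = 2 * j + 1
        rhs = Fraction(xi(m, s))
        for r in range(j):
            rhs -= count_N(2 * m, f, 4 * r + 2) * 2 ** (4 * r) * q[r]
        q.append(rhs / (count_N(2 * m, f, 4 * j + 2) * 2 ** (4 * j)))
    return q
```

With these assumptions, `q_coeffs(3, 3)` and `q_coeffs(4, 3)` return 1, 44, 16, 0 and 1, 408, 912, 64, the coefficients of $k^0, k^2, k^4, k^6$ in the terms $x^6/6!$ and $x^8/8!$ of cn x.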
From (6) and the definition of $\xi_{2s}(m)$ it is evident that 
\[
2^{4j} q_j(s) = (2j+1)^{2s} + A_1(s)(2j-1)^{2s} + A_2(s)(2j-3)^{2s} + \cdots + A_j(s)\cdot 1^{2s}, \tag{7}
\]
where the $A$'s are polynomials in $s$ with rational coefficients. It will be shown 
that the degree in $s$ of $A_r(s)$ is $r$ $(r = 1, \ldots, j)$. The last is an immediate 
consequence of (6) and the following lemma. 

The degree in $s$ of $N(2m, 4s+2, 2m-4h)$ is $h$ $(h = 0, 1, \ldots, (m-1)/2)$. 

Before proving the lemma we shall examine it for $h = 0, 1, 2, 3$. Obviously 
$N(2m, 4s+2, 2m) = 1$. For $h = 1$ we have, as the only possible decomposition 
of $2m$ of the kind enumerated by $N(2m, 4s+2, 2m-4)$, 
\[ 2m = (2m-4)1^2 + [2^2 + \{(4s+2) - (2m-3)\}0^2]; \]
hence, enumerating the corresponding representations, we get 
\[ N(2m,\,4s+2,\,2m-4) = \frac{2\cdot(4s-2m+6)!}{1!\,(4s-2m+5)!} = 4(2s - m + 3). \]



Here a negative result $(m > 2s+3)$ is to be interpreted as zero (no representations), 
and likewise in all similar cases. When $h = 2$, of the $2m-8$ odd squares 
in the representations enumerated by $N(2m, 4s+2, 2m-8)$ all may be $1$'s or 
precisely one may be $3^2$, and there are no other possibilities. Hence the only 
decompositions to be considered are 
\[ 2m = (2m-8)1^2 + [2\cdot 2^2 + (4s-2m+8)0^2] = (2m-9)1^2 + 3^2; \]
whence, counting the representations of the kind enumerated by $N$, we have 
\[ N(2m,\,4s+2,\,2m-8) = \frac{2^2\cdot(4s-2m+10)!}{2!\,(4s-2m+8)!} + \frac{(2m-8)!}{1!\,(2m-9)!}, \]
the last fraction corresponding to the representations obtained by arranging 
the $(2m-9)$ $1$'s and the $3^2$ in all possible ways. Thus 
\[ N(2m,\,4s+2,\,2m-8) = 2[16s^2 - 4(4m-19)s + 4m^2 - 37m + 86]. \]
The only decompositions of $2m$ to be considered when $h = 3$ are 
\[ 2m = (2m-12)1^2 + [3\cdot 2^2 + (4s-2m+11)0^2] = (2m-13)1^2 + 3^2 + [2^2 + (4s-2m+13)0^2]; \]
whence 
\[ N(2m,\,4s+2,\,2m-12) = \frac{2^3(4s-2m+14)!}{3!\,(4s-2m+11)!} + \frac{(2m-12)!}{1!\,(2m-13)!}\cdot\frac{2\cdot(4s-2m+14)!}{1!\,(4s-2m+13)!}, \]
which is of degree 3 in $s$. 
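
The three special cases $h = 1, 2, 3$ can be checked mechanically against a direct count. The sketch below is illustrative only: the power-series counting routine and all function names are assumptions introduced here, and the comparison is restricted to odd $m$ and to cases in which the number $4s - 2m + 4h + 2$ of even places is not negative.

```python
from math import comb


def poly_mul(a, b, n):
    """Multiply two coefficient lists, discarding powers above n."""
    out = [0] * (n + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j > n:
                break
            out[i + j] += ai * bj
    return out


def count_N(n, f, g):
    """Direct count of N(n, f, g) by expanding the generating series."""
    if g > f:
        return 0
    odd = [0] * (n + 1)
    even = [0] * (n + 1)
    even[0] = 1
    a = 1
    while a * a <= n:
        odd[a * a] = 1               # positive odd roots only
        a += 2
    b = 2
    while b * b <= n:
        even[b * b] = 2              # even roots of either sign
        b += 2
    series = [1] + [0] * n
    for _ in range(g):
        series = poly_mul(series, odd, n)
    for _ in range(f - g):
        series = poly_mul(series, even, n)
    return series[n]


def even_places(m, s, h):
    """Number of even squares in a representation counted by N(2m, 4s+2, 2m-4h)."""
    return 4 * s + 2 - (2 * m - 4 * h)


def closed_form(m, s, h):
    """Closed forms for h = 1, 2, 3 as derived in the text."""
    e = even_places(m, s, h)
    if h == 1:
        return 4 * (2 * s - m + 3)
    if h == 2:
        return 2 * (16 * s * s - 4 * (4 * m - 19) * s + 4 * m * m - 37 * m + 86)
    if h == 3:
        return 8 * comb(e, 3) + (2 * m - 12) * 2 * e
    raise ValueError("only h = 1, 2, 3 are tabulated")
```

For example, `closed_form(7, 2, 2)` and `count_N(14, 10, 6)` both give 30, in agreement with $N(14, 4s+2, 6) = 2(16s^2 - 36s + 23)$ at $s = 2$.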
To prove the lemma, consider the decomposition 
\[ 2m = (2m-4h)1^2 + [h\cdot 2^2 + (4s+2-2m+3h)0^2], \]
which contributes to $N(2m, 4s+2, 2m-4h)$ precisely 
\[ A(t) = \frac{2^h\,(t+4h)!}{h!\,(t+3h)!}, \qquad t = 4s+2-2m, \]
representations. If $h\cdot 2^2$ has a decomposition into a sum of $t+4h$ even squares 
other than that in [ ] above, it is of the form indicated in [ ] in the following 
decomposition of $2m$, 
\[ 2m = (2m-4h)1^2 + [h_1a_1^2 + \cdots + h_pa_p^2 + (t+4h-h_1-\cdots-h_p)0^2], \]
where $h_i$, $a_i$ $(i = 1, \ldots, p)$ are $> 0$, $a_i \ge 2$, the $a_1, \ldots, a_p$ are distinct, and 
$h_1a_1^2 + \cdots + h_pa_p^2 = 4h$. This decomposition contributes to $N(2m, 4s+2, 2m-4h)$ precisely 
\[ A'(t) = \frac{2^{h_1+\cdots+h_p}\,(t+4h)!}{h_1!\cdots h_p!\,(t+4h-h_1-\cdots-h_p)!} \]
representations. The degree in $t$, and hence also in $s$, of the polynomial $A(t)$ 
is $h$; that of $A'(t)$ is $h_1 + \cdots + h_p$. Hence, when it is shown that $h_1 + \cdots + h_p < h$, 
the lemma will be proved. The required inequality is obviously implied 
by the following more general situation, which is of use in other questions 
of this kind. Both are practically obvious, but we give a formal proof. 

If $p > 1$, and $h$, $x$, $h_i$, $x_i$ $(i = 1, \ldots, p)$ are any integers $> 0$ such that 
(A) $x_i \ge x$ $(i = 1, \ldots, p)$; (B) $x_1 + \cdots + x_p > px$; (C) $hx = h_1x_1 + \cdots + h_px_p$; 
then $h > h_1 + \cdots + h_p$. 

To prove this, define the $e_i$ by $x_i/x = 1 + e_i$. Then, from (A), $e_i \ge 0$; from 
(B), $e_1 + \cdots + e_p > 0$; hence at least one of $e_1, \ldots, e_p$ is $> 0$. From (C), 
\[ h = \sum_{i=1}^{p} h_i(1 + e_i) = \sum_{i=1}^{p} h_i + \sum_{i=1}^{p} h_ie_i; \]
the second $\sum$ is $> 0$; hence 
\[ h > h_1 + \cdots + h_p. \]
The inequality is also easily seen from a simple contradiction. This completes 
the proof of the lemma. 
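3. Expansions of sn x, dn x. Proceeding as before from 
\[ \vartheta_2^2\,\operatorname{sn}(x\vartheta_3^2) = 4\sum q^{m/2}\Bigl[\sum \sin tx\Bigr] \qquad (m = 1, 3, 5, \ldots;\ m = t\tau,\ t > 0,\ \tau > 0), \]
we get the recurrence (8) for the $p_r(s)$: 
\[
\sum_{r=0}^{(m-1)/2} N(2m,\,4s+4,\,4r+2)\,2^{4r}\,p_r(s) = \zeta_{2s+1}(m), \tag{8}
\]
where $\zeta_{2s+1}(m)$ denotes the sum of the $(2s+1)$th powers of the divisors of $m$. 
To illustrate the calculations, let $m = 1, 3, 5, 7$. Then 
\[ p_0(s)\,N(2,\,4s+4,\,2) = \zeta_{2s+1}(1); \qquad p_0(s) = 1. \]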


For $m = 3, 5, 7$ we need the following $N$'s: 
\[ N(2m,\,4s+4,\,2m) = 1; \]
\[ N(6,\,4s+4,\,2) = \frac{2(4s+2)!}{1!\,(4s+1)!} = 8s + 4; \]
\[ N(10,\,4s+4,\,2) = \frac{2^2(4s+2)!}{2!\,(4s)!} + 2 = 2(16s^2 + 12s + 3); \]
\[ N(10,\,4s+4,\,6) = \frac{2(4s-2)!}{1!\,(4s-3)!} = 4(2s-1); \]
\[ N(14,\,4s+4,\,2) = \frac{2^3(4s+2)!}{3!\,(4s-1)!} + 2\cdot\frac{2(4s+2)!}{1!\,(4s+1)!} = \frac{8}{3}(2s+1)(16s^2 + 4s + 3); \]
\[ N(14,\,4s+4,\,6) = \frac{2^2\cdot(4s-2)!}{2!\,(4s-4)!} + 6 = 32s^2 - 40s + 18; \]
\[ N(14,\,4s+4,\,10) = \frac{2(4s-6)!}{1!\,(4s-7)!} = 4(2s-3). \]
It will be sufficient to indicate the origin of one of these, say $N(14, 4s+4, 2)$: 
\[ 14 = 2\cdot 1^2 + [3\cdot 2^2 + (4s-1)0^2] = 1^2 + 3^2 + [2^2 + (4s+1)0^2]. \]
Substituting these values in (8), and using $\zeta_{2s+1}(3) = 3^{2s+1} + 1$, etc., we find 
\[ p_0(s) = 1; \qquad 2^4 p_1(s) = 3^{2s+1} - 8s - 3; \]
\[ 2^8 p_2(s) = 5^{2s+1} - 4(2s-1)3^{2s+1} + 32s^2 - 32s - 17; \]
\[ 2^{12} p_3(s) = 7^{2s+1} - 4(2s-3)5^{2s+1} + (32s^2 - 88s + 30)3^{2s+1} - \tfrac{1}{3}(256s^3 - 1056s^2 + 752s + 471), \]


agreeing with the values stated by Hermite. As in §2, it can be shown that 
the general form is 
\[
2^{4j} p_j(s) = (2j+1)^{2s+1} + B_1(s)(2j-1)^{2s+1} + \cdots + B_j(s)\cdot 1^{2s+1}, \tag{9}
\]
where $B_r(s)$ is a polynomial in $s$ of degree $r$ with rational coefficients. From 
the MacLaurin expansion the $p_r(s)$ are integers. An explicit (determinant) 
form follows from (8). The relation $\operatorname{sn}(kx, 1/k) = k\operatorname{sn}(x, k)$ gives $p_j(s) = p_{s-j}(s)$. 
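
The closed forms for $p_0(s), \ldots, p_3(s)$ lend themselves to a quick numerical check. The following sketch is an illustration only: it evaluates the four expressions in exact rational arithmetic, and the function name is hypothetical.

```python
from fractions import Fraction


def p_closed(s):
    """Return [p_0(s), p_1(s), p_2(s), p_3(s)] from the closed forms of section 3."""
    e = 2 * s + 1                                  # the exponent 2s + 1
    p0 = Fraction(1)
    p1 = Fraction(3 ** e - 8 * s - 3, 2 ** 4)
    p2 = Fraction(5 ** e - 4 * (2 * s - 1) * 3 ** e + 32 * s * s - 32 * s - 17, 2 ** 8)
    cubic = Fraction(256 * s ** 3 - 1056 * s ** 2 + 752 * s + 471, 3)
    p3 = (7 ** e - 4 * (2 * s - 3) * 5 ** e
          + (32 * s * s - 88 * s + 30) * 3 ** e - cubic) / 2 ** 12
    return [p0, p1, p2, p3]
```

For $s = 3$ the four values are 1, 135, 135, 1, the coefficients of $x^7/7!$ in sn x, and they exhibit the symmetry $p_j(s) = p_{s-j}(s)$; for $s = 1, 2$ the superfluous coefficients $p_j(s)$ with $j > s$ come out as zero.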
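For dn x we have 
\[ \operatorname{dn} x = 1 + \sum_{s=1}^{\infty} (-1)^s \frac{x^{2s}}{(2s)!} \sum_{j=0}^{s-1} r_j(s)\,k^{2j+2}, \]
\[ \vartheta_3^2\,\operatorname{dn}(x\vartheta_3^2) = 1 + 4\sum q^n\Bigl[\sum (-1|\tau)\cos 2tx\Bigr] \qquad (n = 1, 2, 3, \ldots;\ n = t\tau,\ \tau\ \text{odd},\ t > 0,\ \tau > 0); \]
\[
\sum_{j=0}^{n-1} N(4n,\,4s+2,\,4j+4)\,2^{4j}\,r_j(s) = 2^{2s-2}\,\xi_{2s}(n), \tag{10}
\]
where $\xi$ is as defined in §2. To calculate the successive $r_j(s)$, or to exhibit a 
determinant for $r_j(s)$, we take $n = 1, 2, 3, \ldots$, and proceed as before. Thus 
\[ r_0(s) = 2^{2s-2}; \qquad r_1(s) = 2^{2s-6}(2^{2s} - 8s + 4); \]
\[ r_2(s) = 2^{2s-10}[3^{2s} - 4(2s-3)2^{2s} + 32s^2 - 88s + 31]. \]
The general form is 
\[
r_j(s) = 2^{2s-4j-2}[(j+1)^{2s} + C_1(s)j^{2s} + C_2(s)(j-1)^{2s} + \cdots + C_j(s)1^{2s}], \tag{11}
\]
where $C_r(s)$ is a polynomial of degree $r$ in $s$ with rational coefficients. The 
relation $\operatorname{dn}(ku, 1/k) = \operatorname{cn}(u, k)$ gives $r_j(s) = q_{s-1-j}(s)$ ($q$ as in §2); but this 
does not enable us to calculate the general $r_j(s)$ $(j = 1, 2, \ldots)$ successively 
from the $q_j(s)$. 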
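
The relation $r_j(s) = q_{s-1-j}(s)$ gives a convenient cross-check between the closed forms of §2 and §3. The sketch below is illustrative only, with hypothetical function names; the constant term of the cubic in $2^{12}q_3(s)$ is taken as $+297$, the value that satisfies recurrence (6).

```python
from fractions import Fraction


def q_closed(s):
    """[q_0(s), ..., q_3(s)] for cn x, from the closed forms of section 2."""
    e = 2 * s
    q0 = Fraction(1)
    q1 = Fraction(3 ** e - 8 * s - 1, 2 ** 4)
    q2 = Fraction(5 ** e - 8 * (s - 1) * 3 ** e + 32 * s * s - 48 * s - 9, 2 ** 8)
    cubic = Fraction(256 * s ** 3 - 1248 * s ** 2 + 1280 * s + 297, 3)
    q3 = (7 ** e - 8 * (s - 2) * 5 ** e
          + 2 * (16 * s * s - 60 * s + 41) * 3 ** e - cubic) / 2 ** 12
    return [q0, q1, q2, q3]


def r_closed(s):
    """[r_0(s), r_1(s), r_2(s)] for dn x, from the closed forms of section 3."""
    e = 2 * s
    two = Fraction(2)                              # exact powers, also negative ones
    r0 = two ** (e - 2)
    r1 = two ** (e - 6) * (2 ** e - 8 * s + 4)
    r2 = two ** (e - 10) * (3 ** e - 4 * (2 * s - 3) * 2 ** e + 32 * s * s - 88 * s + 31)
    return [r0, r1, r2]


def relation_holds(s):
    """Check r_j(s) = q_{s-1-j}(s) wherever both sides are tabulated."""
    q, r = q_closed(s), r_closed(s)
    return all(r[j] == q[s - 1 - j] for j in range(3) if 0 <= s - 1 - j <= 3)
```

For instance `r_closed(4)` gives 64, 912, 408, which are $q_3(4)$, $q_2(4)$, $q_1(4)$ in that order.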

4. Reciprocal of sn x. This will illustrate expansions in which the origin 
is a simple pole, and in which it is necessary to use the Bernoulli or Euler 
numbers to obtain the coefficients. From our paper already cited,* we have 
\[ \frac{x\vartheta_3^2}{\operatorname{sn}(x\vartheta_3^2)} = x\csc x + 4x\sum q^n\Bigl[\sum \sin \tau x\Bigr] \qquad (n = 1, 2, 3, \ldots;\ n = t\tau,\ t > 0,\ \tau > 0,\ \tau\ \text{odd}), \]
the multiplier $x$ being introduced to render the series regular at the origin. 
The form of the MacLaurin series is easily seen from the indicated division 
of $x$ by the power series for sn x in §3, and we have 
\[ \frac{x}{\operatorname{sn} x} = 1 + \sum_{s=1}^{\infty} \frac{x^{2s}}{(2s-1)!} \sum_{r=0}^{s} (-1)^r h_r(s)\,k^{2r}. \]
To expand $x \csc x$ we shall use the numbers $R$ of Lucas† defined by the 
symbolic identity 


* Messenger of Mathematics, vol. 54 (1924), pp. 116-176, §14, p. 172. 
† E. Lucas, Théorie des Nombres, chapter XIV. 


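\[ x\csc x = 2\cos Rx = 2\sum_{s=0}^{\infty} (-1)^s R_{2s}\frac{x^{2s}}{(2s)!}. \]
In terms of the Bernoulli numbers $B$, in the even-suffix notation, 
\[ x\,\mathrm{ctn}\,x = \cos 2Bx, \qquad B_0 = 1,\ B_1 = -\tfrac{1}{2},\ B_2 = \tfrac{1}{6},\ B_4 = -\tfrac{1}{30},\ \ldots, \]
we have 
\[ R_{2s} = (1 - 2^{2s-1})B_{2s}, \qquad R_0 = \tfrac{1}{2},\ R_2 = -\tfrac{1}{6},\ R_4 = \tfrac{7}{30},\ \ldots. \]
Proceeding as before we find 
\[ s\,\vartheta_3^{4s}\sum_{r=0}^{s} (-1)^r h_r(s)\,k^{2r} = (-1)^s\Bigl[R_{2s} - 4s\sum_{n=1}^{\infty} q^n \zeta'_{2s-1}(n)\Bigr], \]
where $\zeta'_{2s-1}(n)$ denotes the sum of the $(2s-1)$th powers of all the odd (positive) 
divisors of $n$. The left member is 
\[ s\Bigl[h_0(s)\vartheta_3^{4s} + \sum_{r=1}^{s} (-1)^r h_r(s)\,\vartheta_2^{4r}\vartheta_3^{4s-4r}\Bigr], \]
and 
\[ \vartheta_3^{4s} = 1 + \sum_{n=1}^{\infty} N(n, 4s)\,q^n, \]
where $N(n, 4s)$ denotes the total number of representations of $n$ as a sum of $4s$ 
squares. Hence 
\[
s\,h_0(s) = (-1)^s R_{2s}, \qquad h_0(s)N(n, 4s) + \sum_{r=1}^{s} (-1)^r h_r(s)\,2^{4r}N(4n, 4s, 4r) = -4(-1)^s\zeta'_{2s-1}(n), \tag{12}
\]
from which the successive $h_j(s)$ can be calculated as in previous examples, 
and the general form is easily determinable. 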
5. Expansion of $\wp(x)$. This is referred to the expansion of $x^2/\operatorname{sn}^2 x$ by 
means of 
\[ x^2\wp(x) = \frac{x^2}{\operatorname{sn}^2(x, k)} - \frac{1 + k^2}{3}\,x^2, \]
in the customary notation. As Gruder (loc. cit., §13) has shown the connection 
between the coefficients in the polynomials (in $g_2$, $g_3$, or in the absolute 
invariant $g_2^3/g_3^2$) occurring as coefficients in the power series for $\wp(x; g_2, g_3)$ 
and the coefficients in the polynomials (in $k^2$) occurring as coefficients in the 
expansion of $x^2/\operatorname{sn}^2(x, k)$, it will suffice here to give the recurrence for the 
latter. 
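From the expansion of $x/\operatorname{sn} x$ it is easily seen that the MacLaurin series 
is of the form 
\[ \frac{x^2}{\operatorname{sn}^2 x} = \sum_{s=0}^{\infty} \frac{x^{2s}}{(2s)!}\,T_{2s}(k^2), \qquad T_{2s}(k^2) = \sum_{r=0}^{s} t_r(s)\,k^{2r}, \qquad T_0(k^2) = 1; \]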


and from the author’s paper cited above,* we have 
sn? (a3?) 8? (x) 
= [4 + csc? x — cos 2dx) | 
(n = 1, 2, 3,--- ;n = di,d >0,5 > 0), 

where $\sigma_1(n) = \zeta_1(n) + \zeta_1'(n)$, $\zeta$, $\zeta'$ being as defined in §§3, 4. To expand $x^2\csc^2 x$ 
we may use the Bernoulli numbers of the second order,† or proceed as follows 
to obtain the coefficients at once in terms of ordinary Bernoulli numbers. 
The symbolic identity defining the Bernoulli numbers $B$ is $x\,\mathrm{ctn}\,x = \cos 2Bx$. 
Differentiating this with respect to $x$ and multiplying the result throughout 
by $x$, we get 
\[ x^2\csc^2 x = \cos 2Bx + 2Bx\sin 2Bx. \]
Hence, equating coefficients of like powers of $x$, we have 
\[ x^2\csc^2 x = \cos Dx, \qquad D_{2n} = 2^{2n}(1 - 2n)B_{2n} \qquad (n = 0, 1, \ldots). \]
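
The formula for $D_{2n}$ can be confirmed by expanding $x^2\csc^2 x$ directly. The sketch below is illustrative only: it works with exact rational power series in $y = x^2$, generates the Bernoulli numbers from their standard recurrence (with $B_2 = 1/6$, $B_4 = -1/30$), and uses hypothetical function names.

```python
from fractions import Fraction
from math import comb, factorial


def bernoulli(n_max):
    """B_0, ..., B_{n_max} from sum_{k<=n} C(n+1, k) B_k = 0 (so B_1 = -1/2)."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        B.append(-sum(comb(n + 1, k) * B[k] for k in range(n)) / (n + 1))
    return B


def series_inverse(a, terms):
    """Reciprocal of a power series with a[0] != 0, to the given number of terms."""
    inv = [1 / a[0]]
    for n in range(1, terms):
        inv.append(-sum(a[k] * inv[n - k] for k in range(1, n + 1)) / a[0])
    return inv


def series_mul(a, b, terms):
    """Cauchy product of two series, truncated."""
    return [sum(a[i] * b[n - i] for i in range(n + 1)) for n in range(terms)]


def x2_csc2_coeffs(terms):
    """Coefficients of x^(2n) in x^2 csc^2 x, from (sin x)/x = sum (-1)^j x^(2j)/(2j+1)!."""
    sinc = [Fraction((-1) ** j, factorial(2 * j + 1)) for j in range(terms)]
    x_csc = series_inverse(sinc, terms)
    return series_mul(x_csc, x_csc, terms)


def d_numbers(terms):
    """D_0, D_2, D_4, ... from D_{2n} = 2^(2n) (1 - 2n) B_{2n}."""
    B = bernoulli(2 * terms)
    return [2 ** (2 * n) * (1 - 2 * n) * B[2 * n] for n in range(terms)]
```

Since $\cos Dx$ stands for $\sum (-1)^n D_{2n}x^{2n}/(2n)!$, the coefficient of $x^{2n}$ returned by `x2_csc2_coeffs` should equal $(-1)^n D_{2n}/(2n)!$; the first few are $1$, $1/3$, $1/15$.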
The rest of the work is like that in preceding sections, and we get (from 


the coefficients of x”, x*, s>1, in the identity between power series in x) the 
preliminary results 


d#T2(k*) = — 2D — Da, 
= (— 1)*[Dee + 165(2s — 1) 
for s>1, the summations referring to m=1, 2, 3, - - - . The first of these gives 
to(1)ds + = — 23 — De; 
and there are the known expansions 
dst = 1+ 1)"Ar(n), = 16 (m) 
(n = 1,2,3,---;m= 1, 3, 5,--+)s, 


* Messenger of Mathematics, vol. 54 (1924), pp. 116-176, §16, p. 173. 
† N. E. Nörlund, Differenzenrechnung, 1924, p. 129, et seq. The symbolic processes used here are 
justified (among other places) in my Algebraic Arithmetic, 1927. 




where 
Au(m) = [1 + 2(— 
From the definitions of the functions it is easily seen that 
— = 2D (n). 
Hence, finally, we get 
\[ t_0(1) = t_1(1) = \tfrac{2}{3}. \tag{13} \]

All this detail for $t_0(1)$, $t_1(1)$ is of course unnecessary, as $T_2(k^2)$ is readily seen 
to be $\tfrac{2}{3}(1 + k^2)$; but the reduction provides a check on the expansions. 
Reducing the second of the above preliminary results as before we find 


\[ t_0(s) = (-1)^s D_{2s} = (-1)^s\,2^{2s}(1 - 2s)B_{2s}, \tag{14} \]


(15) to(s)N(m, 4s) + (4m, 4s, 4r) = 0 


(m 


to(s)N(2n, 4s) + yo t,(s)N(8n, 4s, 4r) 


r=] 


(16) 
= (- 1)*s(2s (n 1, 2, 3, ), 


all of which hold only for $s > 1$. The functions $N$, $\zeta$ are as previously defined. 
From these the structure of $t_j(s)$ is seen as before (the few specimens given 
by Hermite, Oeuvres, vol. 3, p. 239, in another notation, do not indicate that 
the $(2s-1)$th powers of integers $> 1$ enter the $t_r(s)$ for $r > 2$). Taking $n = 1$ 
in (15) we get $(s > 1)$ 


ti(s) = (— 1)*s(2s — 
and ”=1 in (16), 
to(s) = (— 1)*s(2s — 1)22*-*[1 — 2(4s — 7) Bal, 


which check with tabulated results for s =2, 3, 4, 5. 

6. Further developments. Hermite (Oeuvres, vol. 3, p. 245) was inter- 
ested in these expansions partly on account of their possible applications to 
Gyldén’s methods (followed by Brendel) in the computation of perturbations, 
particularly for the so-called critical planets, whose mean motion is almost 
commensurable with Jupiter’s. In this connection the expansions of power- 
products of sn x, cn x, dn x are required, the powers being positive or nega- 
tive. From the series for sn x, cn x, dn x and their reciprocals, the general 
$k^2$-polynomial form of the coefficient of a given power of $x$ in the expansion of $\operatorname{sn}^a x\,\operatorname{cn}^b x\,\operatorname{dn}^c x$, 
where $a$, $b$, $c$ are integers, can be inferred. The general trigonometric series 
for use in the present method were investigated by Meyer,* from whose gen- 
eral results the types of arithmetical functions appearing in the coefficients 
can be determined. As this is quite an extensive subject we shall not go into 
it here, except to note a necessary change which occurs in the arithmetical 
character of the coefficients when any one of a, b, c passes the value 2: the 
functions are no longer expressible in terms of the divisors of a single integer 
(as they are for all the expansions in the present paper), but refer to represen- 
tations in quadratic forms other than xy (which introduces the functions of 
divisors). For example, one function is }-(xyzw)*, the sum being taken over 
all representations of a fixed integer in the form $x^2 + y^2 + z^2 + w^2$. This is 
analogous to the similar situation concerning the number of representations 
of an integer as a sum of 2s squares when s >4, where we have the classical 
theorems for s =5, 6 which introduce quadratic forms other than xy. 


* C. O. Meyer, Crelle’s Journal, vol. 37 (1848), pp. 273-304. 

† For the following references to the astronomical applications, I am indebted to Professors 
A. O. Leuschner and R. H. Sciobereti. 

(1) M. Brendel, Abhandlungen der Königlichen Gesellschaft der Wissenschaften zu Göttingen, 
vol. 1, No. 2 (1898), part 1, pp. 45-51 and pp. 53-55; vol. 6, No. 4 (1909), part 2, chapter 2, p. 12. 

(2) H. Gyldén, Studien auf dem Gebiete der Störungstheorie, Academy of St. Petersburg, Memoirs, 
(7), vol. 16. 

(3) F. Tisserand, Mécanique Céleste, vol. I, Chapter XVII: Sur certaines fonctions des grands 
axes qui se présentent dans le développement de la fonction perturbatrice. 

(4) H. Poincaré, Les Méthodes Nouvelles de la Mécanique Céleste. Poincaré's summary of Gyldén's 
method is in vol. 2, p. 202, et seq., more particularly pp. 247-251-253. 

(5) H. Gyldén, Traité des Orbites Absolues des 8 Planètes Principales, 1893, vol. I, book II, 
chapter II; vol. I, book III, chapter II, p. 357, p. 394. Brendel's work on this particular subject is 
merely a reproduction of Gyldén's treatment of the perturbative function. 


CALIFORNIA INSTITUTE OF TECHNOLOGY, 
PASADENA, CALIF. 


SOME INEQUALITIES FOR NON-UNIFORMLY BOUNDED 
ORTHO-NORMAL POLYNOMIALS* 


BY 
M. F. ROSSKOPF 


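1. Introduction. Let the set $\{\varphi_n(x)\}$ be an ortho-normal set of functions 
on the interval $(a, b)$ and let $M$ be a constant such that 
\[ |\varphi_n(x)| \le M \qquad (n = 0, 1, 2, \ldots;\ a \le x \le b); \]
then the Fourier expansion of any function $f(x)$ in terms of these functions 
is 
\[ f(x) \sim \sum_{n=0}^{\infty} c_n\varphi_n(x), \qquad \text{where } c_n = \int_a^b f(t)\varphi_n(t)\,dt. \]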


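For sets of functions which satisfy the above assumptions the following two 
theorems of F. Riesz† are well known. 

THEOREM A. Let the set $\{\varphi_n(x)\}$ of ortho-normal functions defined on the 
interval $(a, b)$ satisfy the condition 
\[ |\varphi_n(x)| \le M, \]
and let $f(x) \in L_p$ $(1 < p \le 2)$. Then 
\[ \Bigl(\sum_{n=0}^{\infty} |c_n|^{p'}\Bigr)^{1/p'} \le M^{(2-p)/p}\Bigl(\int_a^b |f(x)|^p\,dx\Bigr)^{1/p}, \]
where $p'$ is determined by the relation $1/p + 1/p' = 1$. 

THEOREM B. If the series $\sum |c_n|^p$ is convergent, then the constants $c_n$ are 
the Fourier coefficients of a function $f(x) \in L_{p'}$ $(p' \ge 2)$, relative to a set of 
uniformly bounded ortho-normal functions; and moreover 
\[ \Bigl(\int_a^b |f(x)|^{p'}\,dx\Bigr)^{1/p'} \le M^{(2-p)/p}\Bigl(\sum_{n=0}^{\infty} |c_n|^p\Bigr)^{1/p}, \]
where $p$ and $p'$ satisfy the relation $1/p + 1/p' = 1$. 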


* Presented to the Society, December 27, 1933; received by the editors May 8, 1934. The author 
is indebted to Professor J. D. Tamarkin for suggestions and criticisms. 

† F. Riesz, Über eine Verallgemeinerung der Parsevalschen Formel, Mathematische Zeitschrift, 
vol. 18 (1923), pp. 117-124. For the case of trigonometric series see F. Hausdorff, Eine Ausdehnung 
des Parsevalschen Satzes über Fourierreihen, Mathematische Zeitschrift, vol. 16 (1923), pp. 163-169. 
In this paper is also given a list of W. H. Young’s papers on the subject. 




As was called to my attention by Professors Hille and Tamarkin, in the 
case of the expansion of the function 
\[ f(x) = (1 - x)^{-a} \tag{1} \]
in normalized Legendre polynomials, F. Riesz's theorems do not hold. 
Stieltjes† considered this function and showed that for the convergence of 
its Legendre series, besides assuming $-1 < x < 1$, it is necessary to take 
$a < \tfrac{3}{4}$; from the asymptotic value of the coefficients this is easily seen, since 
\[ c_n = \Bigl(\frac{2n+1}{2}\Bigr)^{1/2}\frac{2^{1-a}\,\Gamma(1-a)\,\Gamma(n+a)}{\Gamma(a)\,\Gamma(n-a+2)} \sim \frac{2^{1-a}\,\Gamma(1-a)}{\Gamma(a)}\,n^{2a-3/2}. \]
The function (1) belongs to $L_p$ for every $p < 1/a$, whereas the series $\sum |c_n|^{p'}$ 
diverges whenever $a \ge \tfrac{1}{2}$; thus it is seen F. Riesz's first theorem does not apply 
to Legendre series. 

The problem now is to modify the inequalities which appear in Theorems 
A and B so that the Legendre coefficients of a certain class of functions would 
satisfy a new inequality. In particular it is desirable to obtain an inequality 
which would take care of this function of Stieltjes. In the first part of the 
present paper this problem is solved not only for the case of normalized 
Legendre, Jacobi and Hermite polynomials but also for a general class of 
ortho-normal polynomials possessing certain properties. 

The end of the paper contains theorems for our general class of ortho-normal 
polynomials, which were suggested by a publication of R. E. A. C. 
Paley‡ in which he extended some results of Hardy and Littlewood§ from 
Fourier series to the case of a set of uniformly bounded ortho-normal functions. 
The following theorem is typical of Paley's results. 


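THEOREM C. Let $c_0, c_1, c_2, \ldots$ denote a bounded set of numbers such that 
$c_n \to 0$ as $n \to \infty$, and let 
\[ c_0^* \ge c_1^* \ge c_2^* \ge \cdots \]
denote the set $|c_0|, |c_1|, |c_2|, \ldots$ rearranged in descending order of magnitude. 
If the series $\sum c_n^{*\,p'} n^{p'-2}$ converges, where $p' \ge 2$, and if the ortho-normal set 
$\{\varphi_n(x)\}$ satisfies the condition 
\[ |\varphi_n(x)| \le M \qquad (n = 0, 1, 2, \ldots;\ 0 \le x \le 1), \]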


† Correspondance d'Hermite et de Stieltjes, Paris, Gauthier-Villars, 1905, vol. 2, letter 249, p. 46. 

‡ R. E. A. C. Paley, Some theorems on orthogonal functions (1), Studia Mathematica, vol. 3 (1931), 
pp. 226-238. 

§ G. H. Hardy and J. E. Littlewood, Some new properties of Fourier constants, Mathematische 
Annalen, vol. 97 (1926), pp. 159-209. Notes on the theory of series (XIII): Some new properties of 
Fourier constants, Journal of the London Mathematical Society, vol. 6 (1931), pp. 3-9. 


then the function $f(x) \sim \sum c_n\varphi_n(x)$ is of class $L_{p'}$ and 
\[
\int_0^1 |f(t)|^{p'}\,dt \le A_{p'}\sum_{n=0}^{\infty} c_n^{*\,p'}(n+1)^{p'-2}, \tag{2}
\]
where $A_{p'}$ depends only on $p'$ and $M$. 


As in the case of Theorems A and B modifications must be made in the 
inequality (2) in order to arrive at the theorems for our general class of ortho- 
normal polynomials. 

2. Lemmas of M. Riesz. In this section we state two theorems of M. 
Riesz† which will be of fundamental importance in the proofs of our theorems. 
For convenience of reference we shall designate them as Lemmas 1 and 2. 
First we define a certain class of functions; the function $f(x)$ will be said to 
belong to the class $L_\varphi^a$ if the following Lebesgue-Stieltjes integrals of $f(x)$ 
with respect to the non-decreasing function $\varphi(x)$, defined on the interval 
$(a, b)$, exist and are finite: 
\[ \int_a^b |f(x)|^a\,d\varphi(x) \qquad (a \ge 1). \]
Similarly we can define the class of functions $L_\psi^c$ $(c \ge 1)$, corresponding to 
the non-decreasing function $\psi(x)$ defined on the interval $(a', b')$. 


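LEMMA 1. Let $T = T(f)$ be a linear limited functional transformation of 
certain classes $L_\varphi^a$ into certain corresponding classes $L_\psi^c$; i.e., 
(1) the transformation is distributive, so that for arbitrary constants $\lambda_1$, $\lambda_2$, 
\[ T(\lambda_1 f_1 + \lambda_2 f_2) = \lambda_1 T(f_1) + \lambda_2 T(f_2); \]
(2) there exists a constant $M^*$ such that 
\[ \Bigl(\int_{a'}^{b'} |T(f)|^c\,d\psi\Bigr)^{1/c} \le M^*\Bigl(\int_a^b |f|^a\,d\varphi\Bigr)^{1/a}. \]
Denote by $M^*(\alpha, \gamma)$ the least upper bound of the ratio 
\[ \Bigl(\int_{a'}^{b'} |T(f)|^c\,d\psi\Bigr)^{1/c} \Big/ \Bigl(\int_a^b |f|^a\,d\varphi\Bigr)^{1/a} \]
for every couple of exponents $a$ and $c$, where $a\alpha = c\gamma = 1$. If the relation between 
$a$ and $c$ is such that one always has $c \ge a$, and if the point $(\alpha, \gamma)$ describes a 
straight line segment in the triangle $0 \le \gamma \le \alpha \le 1$, then $\log M^*(\alpha, \gamma)$ is a convex 
function of the points of the line segment. 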


† M. Riesz, Sur les maxima des formes bilinéaires et sur les fonctionnelles linéaires, Acta 
Mathematica, vol. 49 (1926), pp. 465-497. In particular see Theorems V and VI. 


LEMMA 2. Every time that one has a linear limited functional transformation 
of $L_\varphi^{a_1}$ into $L_\psi^{c_1}$ and of $L_\varphi^{a_2}$ into $L_\psi^{c_2}$, with $c_1 \ge a_1$, $c_2 \ge a_2$, the transformation can 
be extended to every couple of exponents corresponding to the points $(\alpha, \gamma)$ of the 
line segment joining the points $(\alpha_1, \gamma_1)$ and $(\alpha_2, \gamma_2)$. 


3. Notation and definitions. Let 
\[ A_0(x),\ A_1(x),\ \ldots,\ A_n(x),\ \ldots \]
be polynomials which are of exactly the $n$th degree for each value of $n$; let 
$p(x)$ be a non-negative weight function, integrable and not identically equal 
to zero in the interval $(a, b)$. This set of polynomials will be said to be orthogonal 
if 
\[ \int_a^b p(t)A_n(t)A_m(t)\,dt = 0 \qquad (n \ne m) \]
and normal if 
\[ \int_a^b p(t)A_n^2(t)\,dt = 1. \]
If the Fourier coefficients of a function $f(x)$ relative to these polynomials, 
\[ c_n = \int_a^b p(t)f(t)A_n(t)\,dt, \]
exist, the expansion of $f(x)$ in terms of these polynomials is 
\[ f(x) \sim \sum_{n=0}^{\infty} c_nA_n(x). \]
Let the function $\alpha(x)$ be absolutely continuous and such that 
\[ \alpha'(x) = \beta(x) \ge 0; \]
in addition let $\beta(x) > 0$ except for a set of measure zero. Set 
\[ W(x) = [p(x)/\beta(x)]^{1/2} \ge 0; \tag{3} \]
for convenience we shall write 
\[ J_p(f) = J_p = \Bigl(\int_a^b |W(t)f(t)|^p\,d\alpha(t)\Bigr)^{1/p}, \qquad S_{p'}(f) = S_{p'} = \Bigl(\sum_{n=0}^{\infty} |c_n|^{p'}\Bigr)^{1/p'}. \]
If $J_p^p(f)$ exists, we shall write $W(x)f(x) \in L_\alpha^p$. 

Throughout the paper we shall understand by $p$ and $p'$ two numbers which 
satisfy the relations $1 \le p \le 2$, $p' \ge 2$, $1/p + 1/p' = 1$; hence when $p = 1$, the 
corresponding value of $p'$ is $\infty$. Furthermore, $A$ will be used in the generic 
sense to denote a constant independent of $n$ and $x$. 

We postulate the following properties of the set of polynomials $\{A_n(x)\}$: 

(1) the $A_n(x)$ are ortho-normal in the above sense; 

(2) $|W(x)A_n(x)| \le A$, for all $n$ $(n = 0, 1, 2, \ldots)$ and $a \le x \le b$. 

Property (2) is also a condition for the function $\beta(x)$ since it appears in 
$W(x)$. 

It is interesting to see how the function $W(x)$ is introduced. Bessel's 
inequality for the polynomials $A_n(x)$ suggests putting 
\[ J_2^2 = \int_a^b p(t)\,|f(t)|^2\,dt; \]
on the other hand, in order to use M. Riesz's lemmas we must have 
\[ J_2^2 = \int_a^b |W(t)f(t)|^2\,d\alpha(t) = \int_a^b |W(t)f(t)|^2\,\beta(t)\,dt. \]
Comparison of these two expressions for $J_2^2$ leads to setting 
\[ |W(x)|^2 = p(x)/\beta(x). \]


4. Generalizations of F. Riesz's theorems. Having agreed upon the above 
notation and properties of our ortho-normal polynomials, we can prove the 
following theorems. 

THEOREM I. If 
\[ W(x)f(x) \in L_\alpha^p, \tag{4} \]
then 
\[ S_{p'} \le A^{(2-p)/p}\,J_p, \]
where $A$ is a constant. 

Set 
\[ F(x) = W(x)f(x), \tag{5} \]
and let $\varphi(x) = \alpha(x)$, $\psi(x) = [x]$, where the symbol $[x]$ denotes the greatest 
integer in $x$. By definition the linear transformation $T$ is 
\[ T(F) = c_n = \int_a^b F(t)\,W(t)A_n(t)\,d\alpha(t) \]
for all integral values $n$ of $x$, and of arbitrary value for non-integral values 
of $x$. 

In terms of this notation and the notation of Lemmas 1 and 2 what we 
wish to prove is that 
\[ M^*\Bigl(\frac{1}{p}, \frac{1}{p'}\Bigr) = \sup\,(S_{p'}/J_p) \le A^{(2-p)/p}. \tag{6} \]
To prove the theorem it will suffice to show that $M^*(\tfrac12, \tfrac12)$ and $M^*(1, 0)$ are 
bounded; then to interpolate for other values of $1/p$ and $1/p'$ on the line 
segment joining the two points $(\tfrac12, \tfrac12)$ and $(1, 0)$ by Lemma 2. The desired 
inequality will result from Lemma 1. 

Now $M^*(\tfrac12, \tfrac12) \le 1$, by Bessel's inequality for our ortho-normal polynomials. 
For $M^*(1, 0)$ we write† 
\[ M^*(1, 0) = \sup \frac{\max_{0 \le n < \infty} |c_n|}{J_1}. \]
By Property (2) we have‡ 
\[ |c_n| \le \int_a^b |F(t)|\,|W(t)A_n(t)|\,d\alpha(t) \le A\,J_1; \]
hence we have $M^*(1, 0) \le A$. 

The line segment with end points $(1, 0)$ and $(\tfrac12, \tfrac12)$ has the equation 
$\gamma = 1 - \alpha$, and lies in the triangle $0 \le \gamma \le \alpha \le 1$. Consequently Lemma 2 applies 
and whenever $F(x) \in L_\alpha^p$, the series $\sum |c_n|^{p'}$ converges. The desired 
inequality (6) results from the convexity of $\log M^*(1/p, 1/p')$, assured by 
Lemma 1. Indeed, one has§ 
\[ M^*\Bigl(\frac{1}{p}, \frac{1}{p'}\Bigr) \le \bigl[M^*(\tfrac12, \tfrac12)\bigr]^{2/p'}\bigl[M^*(1, 0)\bigr]^{(2-p)/p} \le A^{(2-p)/p}. \tag{7} \]


THEOREM II. If the series $\sum |c_n|^p$ converges, then the numbers $c_n$ are the 
Fourier coefficients of a function $f(x)$ such that 
\[ W(x)f(x) \in L_\alpha^{p'}, \]
and 
\[ J_{p'} \le A^{(2-p)/p}\,S_p, \]
where $A$ is a constant. 


† The usual convention is made here. When $p' = \infty$, in order to compute $M^*(1, 0)$, the numerator 
is replaced by $\max |c_n|$ over all values $(n = 0, 1, 2, \ldots)$. In the case of an integral appearing in the 
numerator, it is replaced by the upper measurable bound (in the sense of Lebesgue) of the integrand, 
which we shall designate simply as the maximum. Cf. M. Riesz, loc. cit., footnote 2, p. 477. 

‡ The author is indebted to the referee for a simplification of the argument at this point. 

§ For the origin of this inequality see M. Riesz, loc. cit., p. 484. 


The notation of Lemmas 1 and 2 becomes $\psi(x) = \alpha(x)$, $\varphi(x) = [x]$. In the 
case of Theorem I it was $f(x)$ which was varied but now the $c_n$ are the quantities 
varied. By definition the transformation $T$ will be such a transformation 
on the space of elements $c = (c_0, c_1, \ldots, c_n, \ldots)$, $\sum |c_n|^p < \infty$, which 
associates with $c$ the Fourier expansion of the function $[W(x)]^{-1}F(x)$ which 
has the components of $c$ for Fourier coefficients. We set 
\[ T(c) = F(x) \sim W(x)\sum_{n=0}^{\infty} c_nA_n(x). \]
Now $M^*(\tfrac12, \tfrac12) \le 1$, by the Riesz-Fischer theorem. For $M^*(1, 0)$ we must 
write 
\[ M^*(1, 0) = \sup \frac{\max |F(x)|}{S_1}; \]
but by Property (2) the numerator is bounded by $AS_1$; hence $M^*(1, 0) \le A$. 
Using Lemmas 1 and 2 and the inequality (7), the statement of the theorem 
follows. 

5. Jacobi polynomials. In order to show that Theorems I and II hold for 
normalized Jacobi polynomials we have only to prove that they possess 
Properties (1) and (2). 

The Jacobi polynomials $P_n^{(\alpha,\beta)}(x)$, $\alpha > -1$, $\beta > -1$, with the weight function 
$p(x) = (1-x)^\alpha(1+x)^\beta$ and $a = -1$ and $b = 1$, are orthogonal in the sense 
of Property (1). In fact† 

1 
0, 


T(n tat 1) 
then the set of polynomials 


(a ,B) 
(a ,B) P, (x) 


† G. Pólya and G. Szegö, Aufgaben und Lehrsätze aus der Analysis, Berlin, Julius Springer, 1925, 
vol. II, pp. 93, 292. 


(n = 0,1,2,---), 


then 
max | F(x) | 
Si 
n = m; 


M. F. ROSSKOPF 


+1) 


k,(a, B) = 


is an ortho-normal set of polynomials. 

The function $\alpha(x) = \arcsin x$; this $\alpha(x)$ is obviously absolutely continuous 
in the interval $(-1, 1)$, and the function $\beta(x) = (1 - x^2)^{-1/2}$ is positive throughout 
this interval. The function $W(x)$ takes the form 
\[ W(x) = (1 - x)^{\alpha/2 + 1/4}(1 + x)^{\beta/2 + 1/4}. \]
It can be easily verified by applying Stirling’s formula for I'(x) to the 
normalizing factor [k,(a, 8) that 


(a ,B) 2. (a.B) 


(8) On” (x) = O(n) 
Suppose a= —}, B= —}; then the following result of S. Bernstein* is just a 
statement of the fact that the Q,(x) possess Property (2). 

Lemma 3. Suppose that 


max (1 — 4 | P, | = M,(a, B). 


lim n'/2M,(a, B) = 2(¢+)/2M (a, B) 


1 
+ 


THEOREM I’. The result of Theorem I is valid in the case of normalized 
Jacobi polynomials when a>—1,8>—1, provided that f(x) satisfies the further 
condition (1—x)*(1+)®f(x) cL. 

* S. Bernstein, Sur les polynômes orthogonaux relatifs à un segment fini, Journal de Mathématiques, 
(9), vol. 9 (1930), pp. 127-177; vol. 10 (1931), pp. 219-286; see pp. 225-232. These results are proved 
in a very simple way by G. Szegö, Asymptotische Entwicklungen der Jacobischen Polynome, Schriften 
der Königsberger Gelehrten Gesellschaft, vol. 10 (1933), pp. 35-110, p. 79. 


860 pl [October 
where 
Then 
exists and 
(=) ~Sas—, 
2 1/2 1 1 
finite and >(=) ,iffa>— 
T 2 2 
M(a, 8) = 1 1 
or ifa2d 




The analysis is the same as that already given except that in showing that 
|cn| is bounded independently of m a discussion of the case —1<a<—4, 
—1<8<-—} must be given. For this purpose we shall use the following 
bound for P, (x) found by Szego*, 

(9) | P,@-)(x)| < An™*=(a,-1/2) (—1+eS2<1), 
the classical inequality for P,‘«(x) due to Darbouxf, 
6\~ 
| P,,@ ®)(cos 6) | S An-°(sin sin 5) (cos—) 
(10) 


and the well known relation, 


(11) (— x) = (— 19 PY). 


Making use of (8), we write 


(a 


—1+6 1-8 1 


| —1+48 
=1,+12+ Is, 
where 0 <5<1. We have by (9) that 


(12) I; = O(1) (1 — + | de; 


1-8 


using (11) and then (9), we can write 


(13) I, = O(1) (1 — + f(t) | det. 
-1 
By the added condition on f(x) the integrals (12) and (13) tend to zero as 
5—0. There remains to be considered J,; in estimating it we make use of (10), 
1-6 


Iz = O(1) (1 — + | dt 


—1+8 
< (1 — 4)8/2-1/4| | dt; 


* G. Szegö, loc. cit., p. 77. 
† G. Darboux, Sur l'approximation des fonctions de très grands nombres et sur une classe étendue 
de développements en série, Journal de Mathématiques, (3), vol. 4 (1878), pp. 5-56, 377-416; p. 50. 


hence 


1 
len] A] (1 — + | dt = 
-1 
This completes the proof that |c,| is bounded independently of m when 
—1<a<—}, —1<f<-—4; the proof in the case that a= —4, B= —} follows 
as before. 
Our proof depends fundamentally on the applicability of Lemmas 1 and 
2. The space of functions satisfying the conditions of Theorem I’ is a sub- 
space of the space of functions satisfying the conditions of Theorem I. Hence 
what we have assumed is that M. Riesz’s theorems hold in every sub-space 
of their space of definition. That this is true is apparent from the way in 
which M. Riesz derives his results. 
That a similar extension of Theorem II is possible is not at all obvious. 
The lemma of S. Bernstein would seem to preclude that. 
6. Legendre polynomials. The normalized Legendre polynomials defined 
on the interval (—1, 1), 


2n+1 1/2 2n+1 1/2 2m+1 1/2 1 


automatically possess Properties (1) and (2) since they correspond to the 
values $\alpha = \beta = 0$ of the parameters in Jacobi polynomials. The function $\alpha(x)$ 
has the same definition. 
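
Property (2) for the normalized Legendre polynomials can be illustrated numerically. With $p(x) = 1$ and $\beta(x) = (1 - x^2)^{-1/2}$, definition (3) gives $W(x) = (1 - x^2)^{1/4}$, so the quantity to be bounded is $(1 - x^2)^{1/4}(n + \tfrac12)^{1/2}|P_n(x)|$. The sketch below is only a spot check on a finite grid, not a proof; the grid size and the function names are choices made here for illustration.

```python
import math


def legendre_values(n_max, x):
    """P_0(x), ..., P_{n_max}(x) by the three-term recurrence."""
    vals = [1.0, x]
    for n in range(1, n_max):
        vals.append(((2 * n + 1) * x * vals[n] - n * vals[n - 1]) / (n + 1))
    return vals[:n_max + 1]


def max_weighted(n_max, grid=2001):
    """Grid maximum of W(x) * sqrt(n + 1/2) * |P_n(x)| for each n <= n_max."""
    best = [0.0] * (n_max + 1)
    for i in range(grid):
        x = -1.0 + 2.0 * i / (grid - 1)
        w = (1.0 - x * x) ** 0.25               # W(x) for the Legendre case
        for n, p in enumerate(legendre_values(n_max, x)):
            best[n] = max(best[n], w * math.sqrt(n + 0.5) * abs(p))
    return best
```

On this grid the maxima for $n \le 40$ all stay below $0.8$, which is in line with a known sharpening of Bernstein's inequality, $(1 - x^2)^{1/4}|P_n(x)| < (2/\pi)^{1/2}(n + \tfrac12)^{-1/2}$, and so a single constant $A$ serves for every $n$.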

It can be shown that the Theorems I and II for Legendre polynomials 
sift out the correct intervals of convergence and divergence of the sum and 
integral involved in the inequalities for the Stieltjes function (1). 

7. Hermite polynomials. The set of normalized Hermite polynomials, 


1 


with $p(x) = e^{-x^2}$, $a = -\infty$, $b = \infty$, possesses Property (1); indeed* 


f eH, ()An(t)dt = { 


mn = m. 


If a(x) =x, it is easily seen that it is absolutely continuous and that B(x) >0; 
by use of the inequality 
* E. Hille, A class of reciprocal functions, Annals of Mathematics, (2), vol. 27 (1925-26), pp. 
427-464; pp. 431, 436. 
† E. Hille, loc. cit., p. 435. 


| H,(x) | < 


it is easy to see that normalized Hermite polynomials possess Property (2). 
8. Generalizations of Paley's theorems. Let the set of polynomials 
$\{A_n(x)\}$ and the function $\alpha(x)$ have Properties (1) and (2). 
Throughout we shall denote by $c_0, c_1, c_2, \ldots$ a bounded set of numbers 
such that $c_n \to 0$ as $n \to \infty$, and by 
\[ c_0^* \ge c_1^* \ge c_2^* \ge \cdots \]
the set $|c_0|, |c_1|, |c_2|, \ldots$ rearranged in descending order of magnitude. 

THEOREM III. If the series $\sum c_n^{*\,p'}(n+1)^{p'-2}$ converges where $p' \ge 2$, then 
\[
W(x)f(x) \sim W(x)\sum_{n=0}^{\infty} c_nA_n(x) \tag{14}
\]
is of class $L_\alpha^{p'}$, and 
\[ J_{p'}^{p'} \le A\sum_{n=0}^{\infty} c_n^{*\,p'}(n+1)^{p'-2}, \]
where $A$ is a constant. 
It is observed that the series in the right member of (14) converges in the 


mean of order 2 and hence represents some function of the Lebesgue class L,; 
for 


< ( ( ) <0, 


n=0 n=0 n=0 


Consider the inequality 


b 


n=0 


for the case p’ =2 it is well known. If Lemmas 1 and 2 applied, it would be 
sufficient to prove the theorem for positive even values of p’. Let us show 
first that these lemmas do apply. 

Let the linear transformation T be by definition 


An 
T{(m + 1)en} = W(x)f(x) ~ W(x) O(n + Deaf = 


let ¥(x) =a(x), and 


1 
(nSu<n+1;n =0,1,2,---). 


1 
+... 




For the bound M*(1/p’, 1/p’) we have 


( + 1) + | 


sup 


Therefore Lemmas 1 and 2 will apply, and it suffices to prove the theorem 
when ?’ is an even integer. 

To fix the ideas take p’ =4; the proof for other even integral values of p’ 
is similar. In this case (15) becomes 


6 
f | |*da(t) ADS | cn|*(m + 1)?. 
a n=0 


Consider the sequence of functions, 
fo(%) = W(x)coAo(x), = W(x)cAi(x), 


2m—1 
Sm(x) = W(x) cnAn(x), m 2 2; 
and let 
2m—1 
= cl, € = em = m2 2. 


Then, if u, v are any two integers such that 0<y <p, 


f BOR 


s f wr] da(t) - max 


n=2 


3/2 


n= 
S < + €,)2-l/2, 


where use has been made of Hélder’s inequality and Properties (1) and (2). 
Since this result is symmetric in yu and », it holds also if u >». 

It follows from the above equation that if m, ms, ms, m, are arbitrary 
positive integers, then 


ljp’ 




A(€m, + Ems + Em; + €m,) 


Using this result we obtain, 


In the summation over m, the coefficient of an e,, in the above sum is 


m,=1 mg=l m3=1 


it follows that 


b 


m=1 


Now, 


f fe (t)da(t) = co f [W(t)A o(t) |4da(t) Ace! f S = Ae, 


by Property (2); consequently 


(16) f | DX | | | da(t) A Dem = cn |*(m + 1)?. 


m=0 m=0 n=0 


From this we infer that the series 


$$\sum_{m=0}^{\infty} f_m(x) = W(x)\sum_{n=0}^{\infty} c_nA_n(x)$$


converges almost everywhere, but the series $\sum c_nA_n(x)$ converges in the mean of order 2 to $f(x)$ as was remarked at the beginning of the proof. Since these two limits must be the same, we have finally


$$\int_a^b |W(t)f(t)|^4\,d\alpha(t) \leq A\sum_{n=0}^{\infty} |c_n|^4(n+1)^2.$$


It will be noticed that the inequality stated in the theorem was not proved but the inequality (15) was proved instead. The only point of the proof which depends on $n$ is the use of Property (2); furthermore this estimate does not depend at all on the order in which it is made; hence we can assume the $c_n$ are already rearranged in decreasing order of magnitude.


THEOREM IV. If
$$W(x)f(x) \sim W(x)\sum_{n=0}^{\infty} c_nA_n(x) \in L_\alpha^{p},$$
where $1 < p \leq 2$, then
n=O 
where A is a constant. 
In view of our last remark above it suffices to prove 
| cn + S ATP. 
n=0 
Let us write 
$$d_n = |c_n|^{p-1}(n+1)^{p-2}\,\operatorname{sgn} c_n;$$
then
$$c_n = |d_n|^{p'-1}(n+1)^{p'-2}\,\operatorname{sgn} d_n,$$

where $p' \geq 2$, $1/p + 1/p' = 1$. Now


$$\sum_{n=0}^{N} |c_n|^{p}(n+1)^{p-2} = \sum_{n=0}^{N} c_nd_n = \sum_{n=0}^{N} |d_n|^{p'}(n+1)^{p'-2};$$


n=O 


n=0 


gw(x) = 


n=0 


let 




Using Hölder's inequality and Theorem III, we have


N N 
D | cn + = = f p(t) f(t)gn(t)dt 


n=0 n=0 


n=0 


N 1/p’ 


N 1/p’ 


n=0 


therefore 


N 

| cn + SATP(S). 

n=0 
Since $A$ is independent of $N$, the theorem follows by making $N$ tend to infinity.
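The substitution used in this proof can be checked numerically. The following is a minimal sketch, assuming the reading $d_n = |c_n|^{p-1}(n+1)^{p-2}\operatorname{sgn} c_n$ with $1/p + 1/p' = 1$; the sample coefficients and the helper names `dual_coefficients` and `weighted_sum` are illustrative only and do not come from the paper.

```python
import math

def dual_coefficients(c, p):
    """d_n = |c_n|^(p-1) * (n+1)^(p-2) * sgn(c_n), as read from the proof."""
    return [math.copysign(abs(cn) ** (p - 1) * (n + 1) ** (p - 2), cn)
            for n, cn in enumerate(c)]

def weighted_sum(x, q):
    """sum_n |x_n|^q * (n+1)^(q-2), the weighted sum appearing in the theorems."""
    return sum(abs(xn) ** q * (n + 1) ** (q - 2) for n, xn in enumerate(x))

p = 1.5                       # any exponent with 1 < p <= 2
p_conj = p / (p - 1)          # conjugate exponent p', here 3.0
c = [1.0, -0.5, 0.25, 0.0, -2.0]   # arbitrary sample coefficients (assumption)

d = dual_coefficients(c, p)

# Inverting the substitution: c_n = |d_n|^(p'-1) * (n+1)^(p'-2) * sgn(d_n).
recovered = [math.copysign(abs(dn) ** (p_conj - 1) * (n + 1) ** (p_conj - 2), dn)
             for n, dn in enumerate(d)]

lhs = weighted_sum(c, p)                       # sum |c_n|^p (n+1)^(p-2)
mid = sum(cn * dn for cn, dn in zip(c, d))     # sum c_n d_n
rhs = weighted_sum(d, p_conj)                  # sum |d_n|^p' (n+1)^(p'-2)
```

The three quantities `lhs`, `mid` and `rhs` agree up to rounding, which is the chain of equalities that lets Hölder's inequality and Theorem III be applied to the $d_n$.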

The form of the expression (16) suggests the stronger inequality of the following theorem, the details of whose proof are analogous to those of the proof of Paley's† Theorem III. The only change is the introduction of the integrator function $\alpha(x)$ and the factor $W(x)$.


THEOREM V. Let S(x) denote the upper bound 


S(x) = Sup ‘ 


n=0 


Then, if p’>2, 
a n=0 


where $A$ is a constant.


† R. E. A. C. Paley, loc. cit., pp. 232-238.


JOHN BURROUGHS SCHOOL,
CLAYTON, ST. LOUIS COUNTY, MO.




ON BOUNDED LINEAR FUNCTIONAL OPERATIONS* 


BY 
T. H. HILDEBRANDT 


The set of all bounded linear functional operations on a given vector space† plays an important role in the consideration of linear functional operations. For the sake of greater definiteness it is desirable to know the form of such operations and a space determined thereby. This problem has been solved for a number of spaces, for instance, all continuous functions on a bounded closed interval, all Lebesgue $p$th power ($p \geq 1$) integrable functions, all sequences whose $p$th powers ($p \geq 1$) form absolutely convergent series, all sequences having a limit, and so on.‡ All of these spaces have the property of separability. For non-separable spaces, there is a recent determination of the operation for the space of all bounded functions on a finite interval, having at most discontinuities of the first kind, by H. S. Kaltenborn.§

In this paper we give a determination of the linear operation for the space of (a) all bounded sequences, (b) all bounded measurable functions, (c) all bounded functions on the infinite interval having at most discontinuities of the first kind, (d) all bounded continuous functions on the infinite interval, (e) all almost bounded functions. With the least upper bound as norm, none of these spaces is separable.

1. Notations. The integral. We shall denote by

(a) $\mathfrak{P}$ a set of elements $p$.

(b) $\xi$ a real-valued function on $\mathfrak{P}$.

(c) $\mathfrak{X}$ a set of functions $\xi$.

(d) $\mathfrak{E}$ a set or class of subsets $E$ of $\mathfrak{P}$, containing the null set and the set $\mathfrak{P}$.


* Presented to the Society, April 7, 1934; received by the editors April 17, 1934. 

† By a linear vector space $\mathfrak{X}$, we shall mean a so-called Banach space (see Banach, Théorie des Opérations Linéaires, Warsaw, 1932, p. 55) of elements $\xi$ in which there is defined addition, and multiplication by constants, subject to the usual laws of algebra, a unique zero, and a distance function or norm $\|\xi\|$ subject to the conditions $\|\xi_1 + \xi_2\| \leq \|\xi_1\| + \|\xi_2\|$, $\|c\xi\| = |c| \cdot \|\xi\|$ for all $\xi$ and $c$. A linear operation $L$ on $\mathfrak{X}$ transforms $\mathfrak{X}$ into real numbers and satisfies the condition $L(c_1\xi_1 + c_2\xi_2) = c_1L(\xi_1) + c_2L(\xi_2)$. $L$ is bounded and therefore continuous if there exists an $M$ such that, for all $\xi$, $|L(\xi)| \leq M\|\xi\|$. The smallest possible value for $M$ is the modulus or norm $M_L$ of $L$. We shall limit ourselves to real-valued linear operations since a complex-valued operation is expressible as the sum of two real-valued ones.

‡ See Banach, Opérations Linéaires, pp. 59-72; Hildebrandt, Linear functional transformations in general spaces, Bulletin of the American Mathematical Society, vol. 37 (1931), p. 189.

§ See Bulletin of the American Mathematical Society, vol. 40 (1934).




It will be assumed that $\mathfrak{E}$ is additive and multiplicative, i.e., if $E_1$ and $E_2$ belong to $\mathfrak{E}$ so do $E_1 + E_2$ and $E_1E_2$.

(e) $\Pi$ a finite partition or subdivision of $\mathfrak{P}$ into mutually exclusive sets $E_1, \cdots, E_n$ belonging to $\mathfrak{E}$. $\Pi_1 \geq \Pi_2$ shall mean that every set $E^{(1)}$ of $\Pi_1$ is a subset of some set $E^{(2)}$ of $\Pi_2$.

Because of the multiplicative property of $\mathfrak{E}$ the partitions $\Pi$ satisfy the conditions of a range on which the general limit of E. H. Moore-H. L. Smith* is definable, i.e., if $b_\Pi$ is any function defined for all partitions $\Pi$ of $\mathfrak{P}$, then $\lim_\Pi b_\Pi = b$ has the following meaning: for every $e > 0$ there exists a $\Pi_e$, such that if $\Pi \geq \Pi_e$ then $|b_\Pi - b| < e$.

(f) $\alpha(E)$ a function on $\mathfrak{E}$. $\alpha(E)$ is additive if $\alpha(E_1 + E_2) + \alpha(E_1E_2) = \alpha(E_1) + \alpha(E_2)$, for every $E_1$ and $E_2$ of $\mathfrak{E}$. $\alpha(E)$ is of bounded variation on $\mathfrak{P}$ if $\sum_i |\alpha(E_i)|$ is bounded for all $\Pi$ of $\mathfrak{P}$, and the least upper bound of this sum, which agrees with the limit in the $\Pi$ sense if $\alpha$ is additive, is the total variation of $\alpha$, $V(\alpha)$, on $\mathfrak{P}$. Obviously if $\alpha$ is additive, the boundedness of $\alpha$ on $\mathfrak{E}$ is necessary and sufficient that $\alpha$ be of bounded variation.

For a bounded function $\xi$ and a function $\alpha$ it is possible to define the Stieltjes integral $S\!\int \xi\,d\alpha$:

$$S\!\int \xi\,d\alpha = \lim_\Pi \sum_i \xi(p_i)\alpha(E_i),$$

where $\Pi = E_1, \cdots, E_n$, and $p_i$ is any point of $E_i$. We shall say that $\xi$ is $S$-integrable relative to $\alpha$ if the limit on the right exists.

For a bounded function $\xi$ which is measurable relative to $\mathfrak{E}$, in the sense that for every $c$ and $d$ the set $E[c < \xi(p) < d]$ belongs to $\mathfrak{E}$, it is possible to define the Lebesgue integral $L\!\int \xi\,d\alpha$ by the Lebesgue process, viz., if $(a, b)$ is an interval containing the range of values of $\xi$, and $a = y_0 < y_1 < \cdots < y_n = b$ is any subdivision of $(a, b)$ while $y_{i-1} \leq \eta_i \leq y_i$, then

$$L\!\int \xi\,d\alpha = \lim \sum_i \eta_i\,\alpha(E_i),$$

where $E_i = E[y_{i-1} < \xi(p) \leq y_i]$, and the limit is taken as the maximum of $y_i - y_{i-1}$ approaches zero.

If $\alpha$ is additive and bounded on $\mathfrak{E}$, and $\xi$ is measurable relative to $\mathfrak{E}$, then obviously $L\!\int \xi\,d\alpha$ exists. The $S\!\int \xi\,d\alpha$ exists also in this case and agrees with the $L$-integral. The $S$-integral may exist even though $\xi$ be not measurable

* A general theory of limits, American Journal of Mathematics, vol. 44 (1922), p. 103. 

† This is a type of integral suggested by Moore-Smith (loc. cit., p. 114) and considered by Kolmogoroff, Untersuchungen ueber den Integralbegriff, Mathematische Annalen, vol. 103 (1930), pp. 682 ff.




relative to $\mathfrak{E}$. For example if $\mathfrak{P} = [0 < p < 1]$, $\mathfrak{E}$ consists of all finite sets of non-overlapping subintervals of $\mathfrak{P}$ open on the left, while $\alpha(E)$ is the length of $E$, then the set of all bounded Riemann integrable functions on $\mathfrak{P}$ is obviously $S$-integrable with respect to $\alpha$, but includes functions not measurable with respect to $\mathfrak{E}$. The same is true to a lesser degree if $\mathfrak{E}$ is the set of all subsets of $\mathfrak{P}$ having Jordan content and $\alpha(E) = \operatorname{cont} E$.*

2. Bounded sequences. Let $\mathfrak{P}$ be the set of all positive integers $p$. Let $\mathfrak{E}$ be the set of all subsets of $\mathfrak{P}$, i.e., $E$ is any set of positive integers. $\Pi$ is then any division of $\mathfrak{P}$ into a finite number of mutually exclusive sets of positive integers. At least one set in $\Pi$ will contain an infinite number of elements, but they all may.

Let $\mathfrak{X}$ be the vector space consisting of all bounded sequences, i.e., of all bounded real-valued functions $\xi$ on $\mathfrak{P}$, with $\|\xi\|$ the least upper bound of the values $|\xi(p)|$. Then we have the following


THEOREM. Any bounded linear operation $L$ on $\mathfrak{X}$ is expressible in the form $L(\xi) = \int \xi\,d\alpha$, the integral being taken in either the $L$ or $S$ sense, and $\alpha$ being a bounded additive function on $\mathfrak{E}$ whose total variation on $\mathfrak{P}$ is the modulus of $L$. Conversely every such integral is a linear bounded operation on $\mathfrak{X}$.


Let $\chi(E, p)$ represent the characteristic function of the set $E$, i.e., zero for $p$ not on $E$ and unity for $p$ on $E$. Then if $\alpha(E) = L(\chi(E, p))$ it is obviously additive and bounded on $\mathfrak{E}$.

Divide $\mathfrak{P}$ by the partition $\Pi = E_1, \cdots, E_n$. Define

$$\xi(\Pi) = \sum_i \xi(p_i)\chi(E_i, p).$$

Then $\lim_\Pi \|\xi(\Pi) - \xi\| = 0$. For suppose that the range of values of $\xi(p)$ is contained in the interval $(a, b)$, and divide $(a, b)$ into $n$ equal parts by the points $a = y_0 < y_1 < y_2 < \cdots < y_n = b$, so that $(b - a)/n$ is less than a given $e$. If $E_i$ is the set $E[y_{i-1} < \xi(p) \leq y_i]$ and $\Pi_e$ consists of $E_1, \cdots, E_n$, then obviously $\|\xi(\Pi_e) - \xi\| \leq e$. The same inequality will also hold for any repartition $\Pi$ of $\Pi_e$, which demonstrates the assertion.

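Now by the linearity of $L$,

$$L(\xi(\Pi)) = \sum_i \xi(p_i)L(\chi(E_i, p)) = \sum_i \xi(p_i)\alpha(E_i).$$

By the boundedness of $L$ and the convergence of the right-hand side, it follows that

$$L(\xi) = \int \xi\,d\alpha,$$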


* See J. Ridder, Nieuw Archief voor Wiskunde, (2), vol. 15 (1928), pp. 321-9; O. Frink, Annals of Mathematics, (2), vol. 34 (1933), pp. 518-527.




where it is obvious that since $\xi$ is measurable relative to $\mathfrak{E}$, the integral on the right may be defined in either the $S$ or $L$ sense. The fact that the total variation of $\alpha$ is the modulus of $L$ follows from the fact that if

$$\xi_\Pi(p) = \sum_i \chi(E_i, p)\,\operatorname{sgn} \alpha(E_i),$$

then $\|\xi_\Pi\| = 1$ and $L(\xi_\Pi) = \sum_i |\alpha(E_i)|$.

The converse theorem follows from obvious properties of the integral.
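The construction in this proof can be tried out on a finite stand-in for the set of positive integers. The sketch below is illustrative only: the finite point set, the weights defining the functional `L`, and the helper names are assumptions made for the example, and the level sets use half-open cells $[y_{i-1}, y_i)$ rather than the $(y_{i-1}, y_i]$ of the text, which leaves the oscillation bound unchanged.

```python
import math

POINTS = range(1, 41)                                  # finite stand-in for the positive integers
WEIGHTS = {p: (-1) ** p / p ** 2 for p in POINTS}      # hypothetical bounded linear operation

def L(xi):
    """L(xi) = sum_p w_p * xi(p)."""
    return sum(WEIGHTS[p] * xi(p) for p in POINTS)

def alpha(E):
    """alpha(E) = L(characteristic function of E)."""
    members = set(E)
    return L(lambda p: 1.0 if p in members else 0.0)

def level_set_partition(xi, eps):
    """Partition POINTS into sets on which xi oscillates by at most eps."""
    lo = min(xi(p) for p in POINTS)
    hi = max(xi(p) for p in POINTS)
    if hi == lo:
        return [list(POINTS)]
    n = math.ceil((hi - lo) / eps)          # number of equal parts, each of width <= eps
    width = (hi - lo) / n
    cells = {}
    for p in POINTS:
        i = min(n - 1, int((xi(p) - lo) / width))
        cells.setdefault(i, []).append(p)
    return list(cells.values())

def approximating_sum(xi, partition):
    """sum_i xi(p_i) * alpha(E_i), with p_i the first point of E_i."""
    return sum(xi(E[0]) * alpha(E) for E in partition)

xi = lambda p: math.sin(p)
eps = 0.01
partition = level_set_partition(xi, eps)
error = abs(L(xi) - approximating_sum(xi, partition))
total_variation = sum(abs(alpha([p])) for p in POINTS)   # variation over the finest partition
bound = eps * total_variation
```

Here `error` stays below `bound`, mirroring the estimate $|L(\xi) - L(\xi(\Pi_e))| \leq M_L\,e$, and `total_variation` plays the part of the modulus of this particular `L`.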

It is possible to give the result another form. Suppose $p_i$ is the first integer in the set $E_i$. Define the sequence or function $\beta_\Pi(p) = 0$ if $p \neq p_i$, while $\beta_\Pi(p) = \alpha(E_i)$ if $p = p_i$. Then the approximating sum $\sum_i \xi(p_i)\alpha(E_i)$ can be written $\sum_p \beta_\Pi(p)\xi(p)$, where only a finite number of the $\beta_\Pi(p)$ are not zero for a given $\Pi$. We can consequently state the following:


To every linear functional operation $L$ there corresponds a set of sequences $\beta_\Pi(p)$ whose elements are different from zero at most for a finite number of $p$, such that

(2) $\lim_\Pi \sum_p \beta_\Pi(p)\xi(p) = L(\xi)$;

and

(3) $\lim_\Pi \sum_p |\beta_\Pi(p)| = M_L$.


This result parallels a result due to Banach* for separable subspaces of the space $\mathfrak{X}$. While the limit involved in this result can be reduced to a sequential limit for each particular $\xi$, a non-sequential limit is needed for the whole space. The import of the Banach theorem is that for the case of a separable subspace $\mathfrak{X}_0$ of $\mathfrak{X}$, there exists a sequence of partitions $\Pi_n$, such that for every $\xi$ of $\mathfrak{X}_0$,

$$\lim_n \|\xi(\Pi_n) - \xi\| = 0 \quad\text{and}\quad \lim_n \sum_p \beta_{\Pi_n}(p)\xi(p) = L(\xi).$$


It is possible to deduce this result from our general considerations. For this purpose we note that if $\xi_m$ is any sequence of functions of the space $\mathfrak{X}$, it is possible to select a sequence of partitions $\Pi_n$ by the diagonal process, such that $\lim_n \xi_m(\Pi_n) = \xi_m$ for every $\xi_m$ of the given sequence. If $\mathfrak{X}_0$ is a separable subspace of $\mathfrak{X}$ and $\xi_m$ is dense in $\mathfrak{X}_0$, then if $\xi$ belongs to $\mathfrak{X}_0$, there exists a sequence $\xi_{m_k}$ approaching $\xi$. Let $\Pi_1, \cdots, \Pi_n, \cdots$ be the partitions such that for every $m$

$$\lim_n \xi_m(\Pi_n) = \xi_m.$$


* Opérations Linéaires, p. 72.




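Now

$$\lim_k \|\xi_{m_k}(\Pi_n) - \xi(\Pi_n)\| = 0$$

uniformly in $n$, since

$$\|\xi_{m_k}(\Pi_n) - \xi(\Pi_n)\| \leq \|\xi_{m_k} - \xi\|.$$

Hence by the iterated limits theorem on double sequences it follows that $\lim_n \|\xi(\Pi_n) - \xi\| = 0$. Consequently, for every $\xi$ of $\mathfrak{X}_0$,

$$\lim_n \sum_p \beta_{\Pi_n}(p)\xi(p) = \lim_n L(\xi(\Pi_n)) = \int \xi\,d\alpha = L(\xi).$$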


We note that if $\xi$ is a special sequence, it may not be necessary to use all of the values of $\alpha$. For instance if $\xi$ is a sequence converging to zero, it is sufficient to know the values of $\alpha(E_p)$ where $E_p$ consists of the integer $p$ only. Obviously in this case $L(\xi)$ reduces to $\sum_p \xi(p)\alpha(E_p)$, with $\sum_p |\alpha(E_p)|$ convergent, which is a well known result. Similarly for any sequence converging to a limit, the values $\alpha(E_p)$ and $\alpha(\mathfrak{P})$ suffice.

The effect of the fundamental theorem established is that a conjugate 
space to the space of all bounded sequences is the space of all additive 
bounded functions on subsets of integers, with norm the total variation of the 
function. 

The question naturally arises whether additive functions on the set $\mathfrak{E}$ exist, which are not absolutely additive, i.e., whether this form of operation is effective. The instance of sequences approaching a limit must come from such a function. Banach's measure function* on subsets of positive integers gives a complete example.

3. Bounded measurable functions. The procedure in this case is entirely 
analogous to the preceding case. 

We let $\mathfrak{P} = [-\infty < p < \infty]$, $\mathfrak{E}$ the set of all measurable subsets of $\mathfrak{P}$, $\mathfrak{X}$ the set of all bounded measurable functions $\xi$ on $\mathfrak{P}$, with $\|\xi\|$ the least upper bound of $|\xi(p)|$. Then we have


THEOREM. Any linear functional operation $L$ on the space of all bounded measurable functions is expressible in the form $\int \xi\,d\alpha$, where the integral is to be taken in either the $S$ or $L$ sense, the function $\alpha$ is additive and bounded on $\mathfrak{E}$, and the total variation of $\alpha$ is the modulus $M_L$ of $L$. $\alpha(E)$ is the value of $L(\chi(E))$, where $\chi(E)$ is the characteristic function of $E$.


Obviously it is possible to give a theorem corresponding to the Banach result, viz., that the operation $L$ is the $\Pi$-limit of a set of finite sums, each involving the function $\xi$ at only a finite number of points.


* Opérations Linéaires, p. 231. 




4. Bounded functions on the infinite interval having at most discontinuities of the first kind. This class of functions is obviously a subclass of the set of bounded measurable functions. As a consequence it is to be expected that a smaller set $\mathfrak{E}$ will suffice. Let again $\mathfrak{P} = [-\infty < p < \infty]$. Then the set $\mathfrak{E}$ consists of all subsets $E$ of $\mathfrak{P}$ which consist of a finite or infinite number of open intervals and single points, there being at most a finite number of intervals and individual points in the finite part of the fundamental interval, i.e., a set $E$ consists of disjoint open intervals $a_n < p < b_n$, together with points $p_n$ either end points of $(a_n, b_n)$ or not belonging to any $(a_n, b_n)$, where $a_n$, $b_n$, $p_n$ have at most $+\infty$ and $-\infty$ as limiting points. The intervals $-\infty < p < a$ and $a < p < \infty$ will be considered open intervals.

With this definition we have the following 


THEOREM. Every bounded linear operation $L$ on the set of all bounded functions on $-\infty < p < \infty$ having at most discontinuities of the first kind is expressible in the form $\int \xi\,d\alpha$ where the integral is of the $S$ type, $\alpha$ is additive and bounded on $\mathfrak{E}$ and has total variation $M_L$.


In order to prove this theorem it is sufficient to show that the functions

$$\xi(\Pi) = \sum_i \xi(p_i)\chi(E_i, p)$$

approach $\xi$ uniformly in the $\Pi$-sense. For this purpose we utilize the theorem of Lebesgue* that if $\xi$ is bounded and has only discontinuities of the first kind and is limited to a finite interval $(a, b)$ then there exists for any given $e > 0$ a subdivision of $(a, b)$ into a finite number of intervals such that on each open subinterval the oscillation of $\xi$ is less than $e$. It follows that for any given $\xi$ and any $e > 0$, there exists a sequence of points $\cdots < p_{-n} < \cdots < p_{-1} < p_0 < p_1 < \cdots < p_n < \cdots$ approaching $-\infty$ on the left and $+\infty$ on the right, such that interior to each interval $(p_{i-1}, p_i)$ the oscillation of $\xi$ is less than $e$. Suppose now $(c, d)$ contains the region of variation of $\xi(p)$, i.e., $c < \xi(p) < d$ for all $p$. Divide $(c, d)$ into a finite number of equal parts of length $e_0$, by the points $c = y_0 < y_1 < \cdots < y_n = d$. Let the set $E_1$ consist of all the intervals $(p_i, p_{i+1})$ containing in their interior a point $p$ such that $y_0 < \xi(p) \leq y_1$, together with all points $p_i$ satisfying the same condition. Let $E_2$ consist of all the intervals not belonging to $E_1$ which contain a point $p$ for which $y_1 < \xi(p) \leq y_2$, and the points $p_i$ satisfying the same condition, and so on. Then since the oscillation of $\xi$ on any of the intervals $(p_i, p_{i+1})$ is at most $e$, it follows that the oscillation of $\xi(p)$ on any $E_k$ is at most $e + e_0$. Consequently

$$\Bigl\|\xi(p) - \sum_i \xi(p_i)\chi(E_i, p)\Bigr\| \leq e + e_0,$$


* Annales de la Faculté des Sciences de Toulouse, (3), vol. 1 (1909), p. 60. 




$p_i$ being any point of $E_i$. Since the same type of inequality will be valid for any partition $\Pi \geq \Pi_e$, where $\Pi_e$ consists of $E_1, \cdots, E_n$, we have the result desired.

The case in which the infinite interval is replaced by a finite interval has 
been considered by Kaltenborn.* In that case the infinite parts of our parti- 
tions drop away, and it can be shown that the integral depends only on a 
point function of bounded variation and a function zero except at a denumer- 
able set of points, but it is simpler to proceed directly in this case. 

5. Bounded continuous functions on the infinite interval. Obviously the class of functions considered in §4 contains the set of bounded continuous functions as a subset. As a consequence we can effect a further reduction in the set $\mathfrak{E}$. We shall assume that $\mathfrak{E}$ contains all sets $E$ which consist of a finite or denumerable set of non-overlapping half open intervals $(a_n < p \leq b_n)$, whose end points have at most $-\infty$ and $+\infty$ as limiting points. The intervals $-\infty < p \leq a$ and $a < p < \infty$ will be considered to be half open intervals.

With this definition of $\mathfrak{E}$ we can state the same theorem as in the preceding paragraph. It is to be noted, however, that in this case the function $\alpha(E)$ is defined in terms of an extension of the linear operation $L$ on continuous functions to functions having discontinuities of the first kind.

If we limit ourselves to a finite interval, the ordinary Stieltjes integral applies, since because of the continuity of $\xi$, the successive partition limit agrees with the limit as the maximum length of subdivisions approaches zero.

It is possible to give a form to the general theorem which is comparable to the Banach result for sequences. Let $\Pi$ be any partition of $\mathfrak{P}$ into sets $E_1, \cdots, E_n$. Let $p_i$ be any point in the interval of $E_i$ nearest to $p = 0$. Let $\beta_\Pi(p)$ be a point function such that $\beta_\Pi(0) = 0$, and constant except at the points $p = p_i$, where it has a break or saltus of magnitude $\alpha(E_i)$. Then obviously

$$\sum_i \xi(p_i)\alpha(E_i) = \int_{-\infty}^{+\infty} \xi(p)\,d\beta_\Pi(p),$$

where the infinite limits could be replaced by any finite interval containing the points $p_1, \cdots, p_n$ in its interior. It follows that we have the following alternative theorem:

If $L$ is any bounded linear operation on the class of bounded continuous functions on $-\infty < p < \infty$, then there exists a set of point functions $\beta_\Pi(p)$ constant except at a finite number of points such that

$$L(\xi) = \lim_\Pi \int_{-\infty}^{+\infty} \xi(p)\,d\beta_\Pi(p),$$




the integral being an ordinary Stieltjes integral. The functions $\beta_\Pi$ are uniformly of bounded variation and $\lim_\Pi V(\beta_\Pi) = M_L$.


For any separable subset we can obviously proceed as in §2 and replace the limit in the $\Pi$-sense by a sequential limit, i.e., we can find a sequence $\Pi_n$ of partitions which is effective in the limit for all functions of the set.

6. Almost bounded measurable functions. In agreement with common usage the measurable function $\xi$ is almost bounded if it is bounded except for a set of zero measure. The $\|\xi\|$ is defined as the greatest lower bound of positive numbers $a$ such that the set $E[|\xi(p)| > a]$ is of zero measure.

The only difference between this case and that of §3 is that if $E$ is a set of zero measure then $\alpha(E) = 0$, since then $L(\chi(E)) = 0$. It cannot however be concluded that if $\alpha(E) = 0$ for any set of zero measure then $\alpha(E)$ is absolutely continuous and consequently the indefinite integral of a Lebesgue integrable function.

An example on the finite interval $0 \leq p \leq 1$ of an additive bounded function $\alpha(E)$ on measurable sets which satisfies the condition that $\alpha(E) = 0$ for meas $E = 0$, but is not absolutely continuous nor absolutely additive, can be constructed. Let $(a_n, b_n)$ be a sequence of disjoint intervals whose end points have 1 as their only limiting point. If $E_0$ is any measurable subset of $(a_n, b_n)$, then define $\beta(E_0) = mE_0/(b_n - a_n)$. If now $E$ is any subset of $0 \leq p \leq 1$, and $E_n$ the part of $E$ lying on $(a_n, b_n)$, then $\beta(E_n)$ defines a bounded sequence of numbers. The function $\alpha(E) = \int \beta(E_n)\,d\mu$ (in the sense of §2), where $\mu$ is a measure function of Banach on subsets of positive integers,* will be additive on measurable subsets of $(0, 1)$, will satisfy the condition

$$\alpha(E) = 0 \quad\text{if meas } E = 0,$$

but will not be absolutely additive, nor absolutely continuous. For if $E$ is the set $(1 - e \leq p < 1)$ then for all $e > 0$, $\alpha(E) = 1$. Incidentally it appears that if $\xi(p)$ is continuous on $(0, 1)$ then $\int \xi\,d\alpha = \xi(1)$, i.e., as far as the integration of continuous functions is concerned $\alpha(E)$ is equivalent to the function $\gamma(p) = 0$ for $0 \leq p < 1$, $\gamma(1) = 1$.

It is obvious that the results of §§3, 5, and 6 can be extended to corresponding situations in $n$-dimensional space. It would also be possible to set up a general theorem reducing to the special cases considered by a proper choice of the sets $\mathfrak{P}$ and $\mathfrak{E}$.


* Opérations Linéaires, p. 231. 


UNIVERSITY OF MICHIGAN, 
ANN ARBOR, MICH. 


A SET OF FOUR POSTULATES FOR BOOLEAN ALGEBRA 
IN TERMS OF THE “IMPLICATIVE” OPERATION* 


BY 
B. A. BERNSTEIN 


1. Introduction. Whitehead and Russell's Principia Mathematica makes fundamental the notion "$\supset$" of "implication," defined by

$$p \supset q \;.=.\; \sim p \vee q \quad \text{Df.}$$

The main object of my paper is to present in terms of this "implicative" operation† $\supset$ a set of four postulates for Boolean algebra. This will secure for Boolean algebra, for the first time, a set of postulates expressed in terms of an operation other than "rejection" having as few postulates as the present minimum sets.‡ Of course, by the principle of duality in Boolean algebra, my postulates will also be a set in terms of the dual of $p \supset q$, namely $\sim p\,q$.

I prove for my postulates (a) their consistency, (b) their mutual inde- 
pendence, (c) their sufficiency for Boolean algebra, (d) their necessariness for 
Boolean algebra.§ The consistency and independence systems are all Boolean 


* Presented to the Society, June 19, 1933; received by the editors October 9, 1933. 

† The Principia calls $\supset$ a relation.

‡ For the present minimum sets, see B. A. Bernstein, (I) A set of four independent postulates for Boolean algebras, these Transactions, vol. 17 (1916), pp. 50-51; (II) Simplification of the set of four postulates for Boolean algebras in terms of rejection, Bulletin of the American Mathematical Society, vol. 39 (1933), pp. 783-787. For another set of postulates in terms of $\supset$, the first set in terms of $\supset$, see E. V. Huntington, (I) A new set of independent postulates for the algebra of logic, with special reference to Whitehead and Russell's Principia Mathematica, Proceedings of the National Academy of Sciences, vol. 18 (1932), pp. 179-180. Huntington's postulates are eight in number (including an inadvertently omitted existence postulate).

§ I offer (a)-(d) as a set of defining properties of a set of postulates: a system of propositions $S$ is a set of postulates for a (consistent) system $\Sigma$ if and only if the propositions of $S$ are (a) consistent, (b) mutually independent, (c) sufficient for $\Sigma$, (d) necessary for $\Sigma$. More simply and less formally stated, a set of postulates for a system is a set of propositions of the system which cannot be derived from one another but from which all the other propositions of the system can be derived. This view of postulates is opposed to the view, seemingly held widely, that demands of postulates only sufficiency and necessariness (hence also consistency). The latter view would have to accept as a set of postulates for euclidean geometry all of Euclid's Elements, and would violate the generally accepted distinction between postulate and theorem. My view of postulates is of course opposed to the view, seemingly held by some, that demands of postulates only sufficiency. This view would have to accept $S$ as a set of postulates for $\Sigma$, not only when $S$ is the whole of $\Sigma$, but also when $S$ is inconsistent (since "a false proposition implies any proposition") and when $S$ is only a special case of $\Sigma$ (when $S$, for example, is the theory of Abelian groups and $\Sigma$ the theory of groups in general). If I am correct in my view, the term "independent postulates," found in the literature, must be understood to mean "postulates whose independence has been proved," and the term "postulates" applied to $S$ when only (c) and (d) have




and very simple. The proof of sufficiency consists in deriving from the postu- 
lates my second set of postulates in terms of rejection (see my Paper II, 
loc. cit.) ; the proof of necessariness consists in the converse derivation. 

I shall derive from my postulates the “theory of deduction” of the Prin- 
cipia. This will verify the fact, obtained elsewhere* less directly and from 
another point of view, that the theory of deduction is derivable from the gen- 
eral logic of classes. 

There is a close relation between $\supset$ and the operation "$-$" of "exception" used by me†, and later by Taylor‡, in postulates for Boolean algebra. I shall bring out this relation.

If in a set of independent postulates for the logic of classes there is a proposition demanding that the number of elements be at least two, and if this proposition be replaced by a proposition demanding that the number of elements be just two, then the propositions resulting from the change will be sufficient for the logic of propositions as a two-element Boolean algebra. But these propositions will, in general, not be independent. I have so chosen my postulates that the change in question will render them a set of independent postulates for the logic of propositions.§

2. The postulates. The postulates with which we are mainly concerned have as primitive ideas a class $K$ and a binary operation $\supset$, and are the propositions $A_1$-$A_4$ below.∥ In Postulates $A_2$ and $A_3$, there is to be understood the supposition that the elements involved and their indicated combinations belong to $K$. This must especially be borne in mind when the independence of the postulates is considered. The postulates follow.

POSTULATE $A_1$. $a \supset b$ is an element of $K$ whenever $a$ and $b$ are elements of $K$.


been proved for $S$, must be understood to mean "provisional postulates" for $\Sigma$. If desired, "provisional postulates" might have a distinctive name, say basic propositions of $\Sigma$, or defining conditions for $\Sigma$.

* B. A. Bernstein, (III) On section A of Principia Mathematica, Bulletin of the American Mathe- 
matical Society, vol. 39 (1933), pp. 788-792. 

† B. A. Bernstein, (IV) A complete set of postulates for the logic of classes expressed in terms of the operation "exception," and a proof of the independence of a set of postulates due to Del Re, University of California Publications in Mathematics, vol. 1, pp. 87-96.

‡ J. S. Taylor, A set of five postulates for Boolean algebras in terms of the operation "exception," University of California Publications in Mathematics, vol. 1, pp. 242-248.

§ For a discussion of the nature of the logic of propositions, see B. A. Bernstein, (V) Sets of postulates for the logic of propositions, these Transactions, vol. 28 (1926), pp. 472-478.

∥ The symbol "=" used in the postulates is taken as an idea outside the system. Compare my (VI) Whitehead and Russell's theory of deduction as a mathematical science, Bulletin of the American Mathematical Society, vol. 37 (1931), pp. 480-488. For sets of postulates for Boolean algebra in which "=" is taken as an idea within the system, see E. V. Huntington, (II) New sets of independent postulates for the algebra of logic, with special reference to Whitehead and Russell's Principia Mathematica, these Transactions, vol. 35 (1933), pp. 274-304.




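POSTULATE $A_2$. $(a \supset b) \supset a = a$.

POSTULATE $A_3$. There is an element $z$ in $K$ such that

$$(d \supset d) \supset [(a \supset b) \supset c] = \{[(c \supset z) \supset a] \supset [(b \supset c) \supset z]\} \supset z.$$

POSTULATE $A_4$. $K$ consists of at least two distinct elements.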

3. Consistency and independence of the postulates. The consistency and the independence of Postulates $A_1$-$A_4$ are given by the following systems $S_0$-$S_4$, in which $S_0$ is the consistency system, and $S_1$, $S_2$, $S_3$, $S_4$ are the independence systems for $A_1$, $A_2$, $A_3$, $A_4$ respectively. The systems are all Boolean.


System    K       $a \supset b$

$S_0$     0, 1    $a' + b$
$S_1$     0, 1    $0/0$*
$S_2$     0, 1    $0$
$S_3$     0, 1    $a$
$S_4$     0       $0$
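The table can be checked by brute force. The sketch below is an illustration under stated assumptions: the postulates are taken in the forms $A_2$: $(a \supset b) \supset a = a$ and $A_3$: $(d \supset d) \supset [(a \supset b) \supset c] = \{[(c \supset z) \supset a] \supset [(b \supset c) \supset z]\} \supset z$, and the quotient $0/0$ of system $S_1$ is modelled as "not an element of $K$" by returning `None`. The function name `check` is hypothetical.

```python
from itertools import product

def check(K, imp):
    """Return (A1, A2, A3, A4) for carrier K and binary operation imp.

    imp may return None for a combination that is not an element of K.
    A2 and A3 are required only where every indicated combination lies in K.
    """
    K = list(K)

    def op(a, b):
        # Propagate "not in K" through nested combinations.
        if a is None or b is None:
            return None
        r = imp(a, b)
        return r if r in K else None

    a1 = all(op(a, b) is not None for a, b in product(K, repeat=2))

    def a2_ok(a, b):
        lhs = op(op(a, b), a)
        return lhs is None or lhs == a

    a2 = all(a2_ok(a, b) for a, b in product(K, repeat=2))

    def a3_ok(z, a, b, c, d):
        lhs = op(op(d, d), op(op(a, b), c))
        rhs = op(op(op(op(c, z), a), op(op(b, c), z)), z)
        return lhs is None or rhs is None or lhs == rhs

    a3 = any(all(a3_ok(z, *t) for t in product(K, repeat=4)) for z in K)
    a4 = len(set(K)) >= 2
    return (a1, a2, a3, a4)

SYSTEMS = {
    "S0": ([0, 1], lambda a, b: (1 - a) | b),   # a' + b
    "S1": ([0, 1], lambda a, b: None),          # 0/0, read as undefined (assumption)
    "S2": ([0, 1], lambda a, b: 0),
    "S3": ([0, 1], lambda a, b: a),
    "S4": ([0], lambda a, b: 0),
}

results = {name: check(K, imp) for name, (K, imp) in SYSTEMS.items()}
```

Under these readings `results["S0"]` has all four postulates true, while each `results["Si"]` for i = 1, ..., 4 fails exactly the postulate $A_i$, which is the pattern the table is meant to establish.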


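4. Theorems. The proof of the sufficiency of Postulates $A_1$-$A_4$ for Boolean algebra, and for the theory of deduction of the Principia, will be effected with the help of the following theorems 1a-26a.


1a. $(a \supset z) \supset z = a$.
2a. $(a \supset a) \supset b = b$.
3a. $(a \supset b) \supset c = \{[(c \supset z) \supset a] \supset [(b \supset c) \supset z]\} \supset z$.
4a. $(a \supset z) \supset b = (b \supset z) \supset a$.
5a. $a \supset b = (b \supset z) \supset (a \supset z)$.
6a. $a \supset (b \supset z) = b \supset (a \supset z)$.
7a. The element $z$ of Postulate $A_3$ is unique.
DEFINITION 1a. $a_1 = a \supset z$.
8a. $a_{11} = a$, where $a_{11} = (a_1)_1$.
9a. $(a \supset b) \supset c = [(c_1 \supset a) \supset (b \supset c)_1]_1$.
10a. $a_1 \supset b = b_1 \supset a$.
11a. $a \supset b = b_1 \supset a_1$.
12a. $a \supset b_1 = b \supset a_1$.
13a. $a \supset a_1 = a_1$.
14a. $a \supset a = b \supset b$.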

* In a two-element Boolean algebra, we may define the quotient precisely as in the case of the 
algebra of number. 




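DEFINITION 2a. $u = a \supset a$.


15a. $z \supset a = u$.

16a. $u \supset a = a$.

17a. $z_1 = u$, $u_1 = z$.

18a. $a \supset b = (b \supset a_1) \supset a_1$.

19a. $b_1 \supset a = (a \supset b) \supset b$.

20a. $b_1 \supset (a \supset b) = a \supset b$.

21a. $a \supset (a \supset b) = a \supset b$.

22a. $(b \supset a) \supset (a \supset b) = a \supset b$.

23a. $a \supset (b \supset a) = u$.

24a. $(b \supset a) \supset (b \supset c) = a \supset (b \supset c)$.


DEFINITION 3a. $a|b = b \supset a_1$.
DEFINITION 4a. $a' = a|a$.


25a. $a' = a_1$.


DEFINITION 5a. $\sim a = a_1$.
DEFINITION 6a. $a \vee b = a_1 \supset b$.


26a. $\sim a \vee b = a \supset b$.


DEFINITION 7a. $\vdash a = (a = u)$.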


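5. Proofs of the theorems. The proofs of the theorems 1a-26a follow.

Proof of 1a. $a = (a \supset a) \supset a = (a \supset a) \supset [(a \supset z) \supset a] = \{[(a \supset z) \supset a] \supset [(z \supset a) \supset z]\} \supset z = (a \supset z) \supset z$, by $A_2$, $A_2$, $A_3$, $A_2$.

Proof of 2a. $(a \supset a) \supset b = (a \supset a) \supset [(b \supset z) \supset b] = \{[(b \supset z) \supset b] \supset [(z \supset b) \supset z]\} \supset z = (b \supset z) \supset z = b$, by $A_2$, $A_3$, $A_2$, 1a.

Proof of 3a. $(a \supset b) \supset c = (d \supset d) \supset [(a \supset b) \supset c] = \{[(c \supset z) \supset a] \supset [(b \supset c) \supset z]\} \supset z$, by 2a, $A_3$.

Proof of 4a. $(a \supset z) \supset b = \{[(b \supset z) \supset a] \supset [(z \supset b) \supset z]\} \supset z = \{[(b \supset z) \supset a] \supset z\} \supset z = (b \supset z) \supset a$, by 3a, $A_2$, 1a.

Proof of 5a. $a \supset b = [(a \supset z) \supset z] \supset b = (b \supset z) \supset (a \supset z)$, by 1a, 4a.

Proof of 6a. $a \supset (b \supset z) = [(b \supset z) \supset z] \supset (a \supset z) = b \supset (a \supset z)$, by 5a, 1a.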

Proof of 7a. Suppose that two elements, y and z, have the property of z. 
Then >y] > [(¢>2) >y]} >y={2> [(¢>2) } 
=(z>y) >y=z, by 1a, 3a, 1a, 2a, 1a. 

Proof of 8a. By def. 1a, 1a. 

Proof of 9a. By def. 1a, 3a. 

Proof of 10a. By def. 1a, 4a. 




Proof of 11a. By def. 1a, 5a. 

Proof of 12a. By def. 1a, 6a. 

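Proof of 13a. $a_1 = (a_1 \supset z) \supset a_1 = [(a \supset z) \supset z] \supset a_1 = a \supset a_1$, by $A_2$, def. 1a, 1a.

Proof of 14a. $a \supset a = [(a \supset a) \supset z] \supset z = z \supset z = [(b \supset b) \supset z] \supset z = b \supset b$, by 1a, 2a, 2a, 1a.

Proof of 15a. $z \supset a = [(z \supset a) \supset z] \supset z = z \supset z = u$, by 1a, $A_2$, def. 2a.

Proof of 16a. $u \supset a = (a \supset a) \supset a = a$, by def. 2a, $A_2$.

Proof of 17a. $z_1 = z \supset z = u$, by def. 1a, def. 2a; $u_1 = (z \supset z) \supset z = z$, by def. 2a, def. 1a, $A_2$.

Proof of 18a. $(b \supset a_1) \supset a_1 = [(a_{11} \supset b) \supset (a_1 \supset a_1)_1]_1 = [(a_1 \supset a_1) \supset (a_{11} \supset b)_1]_1 = (a_{11} \supset b)_{11} = a \supset b$, by 9a, 12a, 2a, 8a.

Proof of 19a. $b_1 \supset a = (a \supset b_{11}) \supset b_{11} = (a \supset b) \supset b$, by 18a, 8a.

Proof of 20a. $b_1 \supset (a \supset b) = [(a \supset b) \supset b] \supset b = (b_1 \supset a) \supset b = (a_1 \supset b) \supset b = b_1 \supset a_1 = a \supset b$, by 19a, 19a, 10a, 19a, 11a.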

Proof of 21a. a>(a>6) =[(a>b) 9a] [(an >a) 2 (63a): 
[(a>a) 3 = (63 2a = by 18a, Ya, 8a, 
2a, 8a, 18a. 

Proof of 22a. (63a) > (a>b) = { 3b] > [a> (a>) 
2b] by 9a, 21a, As, 8a. 

Proof of 23a. a>(b>a) =[(b> a) a,=[(an 3b) > 
[(au > b) >a]; by 18a, 9a, 13a, 8a, As, 
def. 2a. 

Proof of 24a. (b>a)>(b>c) = 3b] > [a> = { [i> 
(b>c)|> [ar(brdh}i = = {ur [ar(b> 
c) ji }i=[a>(b>c)]u=a>(b>0), by 9a, 10a, 11a, 23a, 16a, 8a. 

Proof of 25a. $a' = a|a = a \supset a_1 = a_1$, by def. 4a, def. 3a, 13a.

Proof of 26a. $\sim a \vee b = a_1 \vee b = a_{11} \supset b = a \supset b$, by def. 5a, def. 6a, 8a.
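Since the theorems are to hold in every Boolean algebra, a selection of them can be spot-checked in a small model. The sketch below is only a sanity check, not a proof: it assumes the four-element algebra of subsets of a two-point set (encoded as bitmasks), with $a \supset b$ read as $a' + b$ and $z$ as the null element, and it tests the identities in the forms read from the proofs above; where the scan is illegible those readings are themselves assumptions.

```python
from itertools import product

MASK = 0b11                  # subsets of a two-point set, as bitmasks
K = range(MASK + 1)
Z, U = 0, MASK               # z is the null element, u = a > a the whole set

def imp(a, b):
    """a > b read as a' + b."""
    return (~a & MASK) | b

def sub1(a):
    """Definition 1a: a_1 = a > z."""
    return imp(a, Z)

THEOREMS = {
    "1a":  lambda a, b, c: imp(imp(a, Z), Z) == a,
    "2a":  lambda a, b, c: imp(imp(a, a), b) == b,
    "3a":  lambda a, b, c: imp(imp(a, b), c)
                           == imp(imp(imp(imp(c, Z), a), imp(imp(b, c), Z)), Z),
    "4a":  lambda a, b, c: imp(imp(a, Z), b) == imp(imp(b, Z), a),
    "5a":  lambda a, b, c: imp(a, b) == imp(imp(b, Z), imp(a, Z)),
    "6a":  lambda a, b, c: imp(a, imp(b, Z)) == imp(b, imp(a, Z)),
    "13a": lambda a, b, c: imp(a, sub1(a)) == sub1(a),
    "14a": lambda a, b, c: imp(a, a) == imp(b, b),
    "15a": lambda a, b, c: imp(Z, a) == U,
    "16a": lambda a, b, c: imp(U, a) == a,
    "18a": lambda a, b, c: imp(a, b) == imp(imp(b, sub1(a)), sub1(a)),
    "21a": lambda a, b, c: imp(a, imp(a, b)) == imp(a, b),
    "23a": lambda a, b, c: imp(a, imp(b, a)) == U,
    "24a": lambda a, b, c: imp(imp(b, a), imp(b, c)) == imp(a, imp(b, c)),
}

results = {name: all(f(a, b, c) for a, b, c in product(K, repeat=3))
           for name, f in THEOREMS.items()}
```

Every entry of `results` comes out true in this model. A check of this kind can expose a misread identity, but it cannot replace the derivations from $A_1$-$A_4$.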

6. Sufficiency of the postulates. I shall now prove the sufficiency of postulates $A_1$-$A_4$ by deriving from them my second set of postulates for Boolean algebra in terms of rejection.* This set has as primitive ideas $K$ and "$|$," and as postulates the propositions $B_1$-$B_4$ following (in postulates $B_3$ and $B_4$ there is to be understood the supposition that the elements involved and their indicated combinations belong to $K$).

$B_1$. $K$ contains at least two distinct elements.

$B_2$. If $a$ and $b$ are elements of $K$, $a|b$ is an element of $K$.

DEFINITION 1b. $a' = a|a$.


$B_3$. $a = (b|a)|(b'|a)$.
$B_4$. $a|(b|c) = [(c'|a)|(b'|a)]'$.

* See my Paper II, loc. cit. 




The derivations of $B_1$-$B_4$ from $A_1$-$A_4$ follow.

Proof of $B_1$. By $A_4$.

Proof of $B_2$. By def. 3a, def. 1a, $A_1$.

Proof of B;. = [ (b b) [ (Qu > b) =] (b 2a): = (a>b) 
(b> a): = (b> a;) > (ab): = (b a1) (b1 = a) | a) = a) | |), 
by 8a, 2a, 9a, 8a, 12a, 11a, def. 3a, 25a. 

Proof of By. (634) =[(an 3) 3 (4 >a): h= 
[(b1 > > (> = a1) as): = [(C > ]’ = [(C'| 
| (b’|a)]’, by def. 3a, 12a, 9a, 10a, 12a, 25a, def. 3a. 

7. Necessariness of the postulates. I shall prove that $A_1$-$A_4$ are necessary for Boolean algebra by deriving $A_1$-$A_4$ from the rejection postulates $B_1$-$B_4$ above. For this derivation I shall use as auxiliary theorems propositions 1b-8b following, derivable from $B_1$-$B_4$.


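1b. $a'' = a$, where $a'' = (a')'$.

2b. $a\mid b = b\mid a$.

3b. $a\mid(b\mid b') = a'$.

4b. $(a\mid c)\mid(b\mid c) = [c\mid(a'\mid b')]'$.

5b. $a\mid a' = b\mid b'$.

DEFINITION 2b. $u = a\mid a'$.

6b. $a\mid u = a'$.

7b. $a\mid u' = u$.

DEFINITION 3b. $a\supset b = a\mid b'$.

8b. $a\supset u' = a'$.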


Propositions 1b, 2b, 3b, 5b are respectively Sheffer’s Postulate 3, Theo- 
rem A, Postulate 4, Theorem B.* The proofs of 4b, 6b, 7b, and 8b follow. 

Proof of 4b. [cl =(6|0)| (a|c), by Ba 1b. 

Proof of 6b. $a\mid u = a\mid(a\mid a') = a'$, by def. 2b, 3b.

Proof of 7b. $a\mid u' = [(a\mid u')']' = [(a\mid u')\mid(a\mid a')]' = [(u'\mid a)\mid(a'\mid a)]' = [a\mid(u''\mid a'')]'' = a\mid(u\mid a) = a\mid(a\mid u) = a\mid a' = u$, by 1b, 3b, 2b, 4b, 1b, 2b, 6b, def. 2b.

Proof of 8b. $a\supset u' = a\mid u'' = a\mid u = a'$, by def. 3b, 1b, 6b.
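
As an informal check, not part of the derivation, the rejection postulates and propositions 1b-8b can be evaluated in a small concrete Boolean algebra. The sketch below assumes the algebra of subsets of a three-element set, coded as bitmasks, and reads $a\mid b$ as $a'+b'$ (the dual reading $a'b'$ would serve equally well). The function names are hypothetical, and each formula is tested in the form written in the adjacent comment.

```python
from itertools import product

W = 3                       # number of atoms; an arbitrary choice for this sketch
MASK = (1 << W) - 1         # the universe element, all W bits set
K = range(MASK + 1)         # every subset of the W atoms, coded as a bitmask


def stroke(a, b):
    """a|b, read here as a' + b' (an assumed interpretation)."""
    return ~(a & b) & MASK


def neg(a):
    """Definition 1b: a' = a|a."""
    return stroke(a, a)


U = stroke(0, neg(0))       # Definition 2b: u = a|a', evaluated at a = 0


def check_b3_b4():
    """B3: a = (b|a)|(b'|a);  B4: a|(b|c) = [(c'|a)|(b'|a)]'."""
    for a, b, c in product(K, repeat=3):
        if a != stroke(stroke(b, a), stroke(neg(b), a)):
            return False
        rhs = neg(stroke(stroke(neg(c), a), stroke(neg(b), a)))
        if stroke(a, stroke(b, c)) != rhs:
            return False
    return True


def check_1b_to_8b():
    """Propositions 1b-8b, each tested for all a, b, c in K."""
    for a, b, c in product(K, repeat=3):
        ok = (
            neg(neg(a)) == a                              # 1b: a'' = a
            and stroke(a, b) == stroke(b, a)              # 2b: a|b = b|a
            and stroke(a, stroke(b, neg(b))) == neg(a)    # 3b: a|(b|b') = a'
            and stroke(stroke(a, c), stroke(b, c))
            == neg(stroke(c, stroke(neg(a), neg(b))))     # 4b
            and stroke(a, neg(a)) == stroke(b, neg(b))    # 5b: a|a' = b|b'
            and stroke(a, U) == neg(a)                    # 6b: a|u = a'
            and stroke(a, neg(U)) == U                    # 7b: a|u' = u
            and stroke(a, neg(neg(U))) == neg(a)          # 8b: a > u' = a|u''
        )
        if not ok:
            return False
    return True
```

Both functions return `True` for this algebra. A finite check of this kind only confirms that the formulas hold in one model; it does not replace the derivations from $B_1$-$B_4$.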

The derivations of $A_1$-$A_4$ from $B_1$-$B_4$ now follow.

Proof of $A_1$. By def. 3b, def. 1b, $B_2$.

Proof of Az. >a=(a| b’)| a’ b’)| [a| (b| ] =(0’| a)| [(0’| 


* See H. M. Sheffer, A set of five independent postulates for Boolean algebras, with application to 
logical constants, these Transactions, vol. 14 (1913), pp. 481-488. 




= {a| [b’"| b)’]}’= [b| }’= }’=[a| (| a’) 
(a| u)’=a’’ =a, by def. 3b, 3b, 2b, 4b, 1b, 2b, def. 2b, 7b, 6b, 1b. 

Proof of $A_3$. The element $u'$ will serve as the required element $z$. For,
{[(c>u’) >a] > [(b>c) } au’ = [(c’ 2a) =[(c’|a’)| (b|c’)’’]’ 
| (a| d’) = [c’| (al 6’)}’| (d|d’) =(a| d’)| |’ = > [(a>8) Dc], by 
8b, def. 3b, 1b, 2b, 4b, 3b, 1b, 2b, def. 3b. 

Proof of $A_4$. By $B_1$.
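
The passage from rejection to implication can also be checked mechanically in a finite model. The following sketch is an illustration only. It assumes a four-element Boolean algebra coded as bitmasks, reads $a\mid b$ as $a'+b'$, defines $a\supset b = a\mid b'$ as in Definition 3b, and takes $z = u'$. The forms of $A_2$ and $A_3$ used are the ones written in the comments, as they can be read off from the first and last members of the proofs; the function names are hypothetical.

```python
from itertools import product

W = 2                       # two atoms, so K has four elements (arbitrary choice)
MASK = (1 << W) - 1
K = range(MASK + 1)


def stroke(a, b):
    """a|b, read here as a' + b' (an assumed interpretation)."""
    return ~(a & b) & MASK


def neg(a):
    """Definition 1b: a' = a|a."""
    return stroke(a, a)


def implies(a, b):
    """Definition 3b: a > b = a|b'."""
    return stroke(a, neg(b))


U = stroke(0, neg(0))       # Definition 2b
Z = neg(U)                  # the element u' used as z in the proof of A3


def check_a1():
    """A1 (closure): a > b is again an element of K."""
    return all(implies(a, b) in K for a, b in product(K, repeat=2))


def check_a2():
    """A2: (a > b) > a = a."""
    return all(implies(implies(a, b), a) == a for a, b in product(K, repeat=2))


def check_a3():
    """A3: {[(c > z) > a] > [(b > c) > z]} > z = (d > d) > [(a > b) > c]."""
    for a, b, c, d in product(K, repeat=4):
        lhs = implies(
            implies(implies(implies(c, Z), a), implies(implies(b, c), Z)), Z
        )
        rhs = implies(implies(d, d), implies(implies(a, b), c))
        if lhs != rhs:
            return False
    return True
```

All three checks return `True` here, which is consistent with the element $u'$ serving as $z$.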

8. Derivation of the theory of deduction. I now come to the derivation from $A_1$-$A_4$ of the theory of deduction of Principia Mathematica. The primitive ideas of this theory are a class $K$, a unary operation "$\sim$," a binary operation "$\vee$," and a notion "$\vdash$," which may perhaps be termed a predicative relation. The postulates of the theory are the propositions $C_1$-$C_7$ below. These postulates are expressed in terms of $K$, $\sim$, $\vee$, $\vdash$, and an operation "$\supset$" defined by

DEFINITION 1c. $a\supset b = \sim a\vee b$.

By 26a, the "$\supset$" of Definition 1c is seen to be the same as the "$\supset$" of postulates $A_1$-$A_4$. This fact will be used hereafter without further mention. The postulates $C_1$-$C_7$ follow.

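$C_1[*1\cdot1]$. If $\vdash a$ and $\vdash(a\supset b)$, then $\vdash b$.

$C_2[*1\cdot2]$. $\vdash[(a\vee a)\supset a]$.

$C_3[*1\cdot3]$. $\vdash[a\supset(b\vee a)]$.

$C_4[*1\cdot4]$. $\vdash[(a\vee b)\supset(b\vee a)]$.

$C_5[*1\cdot6]$. $\vdash\{(a\supset b)\supset[(c\vee a)\supset(c\vee b)]\}$.

$C_6[*1\cdot7]$. If $a$ is in $K$, then $\sim a$ is in $K$.

$C_7[*1\cdot71]$. If $a$ and $b$ are in $K$, then $a\vee b$ is in $K$.

The derivations of $C_1$-$C_7$ from $A_1$-$A_4$ follow.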

Proof of $C_1$. Let $\vdash a$ and $\vdash(a\supset b)$. Then $a = u$ and $a\supset b = u$, by def. 7a. Hence $u\supset b = u$. Hence $b = u$, by 16a; hence $\vdash b$, by def. 7a.

Proof of C2. (ava) >a=(a, 3a) Da=a,34,=4u, by def. 6a, 19a, def. 2a. 
Hence the theorem, by def. 7a. 

Proof of $C_3$. $a\supset(b\vee a) = a\supset(b_1\supset a) = u$, by def. 6a, 23a. Hence the theorem, by def. 7a.

Proof of Cy. > (bva) =(a, 36) =(a, 9b) 3b) =u, by 
def. 6a, 10a, def. 2a. Hence the theorem, by def. 7a. 

Proof of C;. (a>6) > [(cva) > (cvb)] = > [(a 2a) 3(4 = 


† The numbers associated with $C_1$-$C_7$ are those of the Principia. For the form of $*1\cdot1$, see my (VII) Remarks on propositions $*1\cdot1$ and $*3\cdot35$ of Principia Mathematica, Bulletin of the American Mathematical Society, vol. 39 (1933), pp. 111-114. The Principia proposition $*1\cdot5$ has been omitted, since $*1\cdot5$ is redundant (see P. Bernays, Mathematische Zeitschrift, vol. 25 (1926), pp. 305-320).




(a>b) > [a> =b> [a> (a. = { [a> (1 = { (bu >a) 
>[(ci > 6) Dh ={ (62a) ={( >a) 
=u, by def. 6a, 24a, 24a, 18a, 9a, 8a, 10a, As, 8a, Az, def. 2a. Hence the 
theorem, by def. 7a. 

Proof of $C_6$. By def. 5a, def. 1a, $A_1$.

Proof of $C_7$. By def. 6a, def. 1a, $A_1$.
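
The reading of "$\vdash a$" as "$a = u$" (def. 7a, as used in the proof of $C_1$) makes the postulates of the theory of deduction directly testable in a finite Boolean algebra. The sketch below is offered only as an illustration. It assumes the algebra of subsets of a three-element set coded as bitmasks, with $\sim$ as complement, $\vee$ as union and $u$ as the universe; the function names are hypothetical and the postulates are tested in the form written in the comments.

```python
from itertools import product

W = 3                        # number of atoms; an arbitrary choice for this sketch
U = (1 << W) - 1             # the element u (universe)
K = range(U + 1)


def lnot(a):
    """~a, taken here as the Boolean complement."""
    return ~a & U


def lor(a, b):
    """a v b, taken here as the Boolean sum."""
    return a | b


def implies(a, b):
    """Definition 1c: a > b = ~a v b."""
    return lor(lnot(a), b)


def asserted(a):
    """'|- a' read as 'a = u'."""
    return a == U


def check_c1():
    """C1: if |- a and |- (a > b), then |- b."""
    return all(
        asserted(b)
        for a, b in product(K, repeat=2)
        if asserted(a) and asserted(implies(a, b))
    )


def check_c2_to_c5():
    """C2-C5: each formula takes the value u for all a, b, c in K."""
    for a, b, c in product(K, repeat=3):
        ok = (
            asserted(implies(lor(a, a), a))                      # C2 [*1.2]
            and asserted(implies(a, lor(b, a)))                  # C3 [*1.3]
            and asserted(implies(lor(a, b), lor(b, a)))          # C4 [*1.4]
            and asserted(
                implies(implies(a, b), implies(lor(c, a), lor(c, b)))
            )                                                    # C5 [*1.6]
        )
        if not ok:
            return False
    return True
```

$C_6$ and $C_7$ are closure statements and hold trivially in this model, since complement and union of bitmasks below `U + 1` stay in the same range.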

9. Relation between the implicative operation and the operation exception. I shall now bring out the relation existing between the implicative operation $\supset$ and the operation "$-$" of "exception."

The considerations are simple. The element $a-b$ is, in the usual Boolean notation, the element $ab'$. Since $a\supset b$ is the element $a'+b$, the elements $a\supset b$ and $b-a$ are the duals of each other. Hence, a postulate-set in terms of $\supset$ is essentially also a set in terms of "$-$," and vice versa.

Let me actually transform Postulates $A_1$-$A_4$ into a set in terms of "$-$." To do this, it will be convenient to re-letter the formulas in $A_1$-$A_4$. If we write $b-a$ for $a\supset b$, $u$ for $z$ (the dual of $z$), and re-letter, Postulates $A_1$-$A_4$ become the following postulates $D_1$-$D_4$ in terms of "exception."

$D_1$. $a-b$ is an element of $K$ whenever $a$ and $b$ are elements of $K$.

$D_2$. $a-(b-a) = a$.

$D_3$. There is an element $u$ in $K$ such that

$$[a-(b-c)]-(d-d) = u-\{[u-(a-b)]-[c-(u-a)]\}.$$

$D_4$. $K$ consists of at least two distinct elements.
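
The duality between $a\supset b$ and $b-a$, and the postulates $D_2$, $D_3$, can be illustrated on a finite model. The sketch below assumes the subsets of a three-element set coded as bitmasks, with $a-b = ab'$, $a\supset b = a'+b$ and $u$ the universe. "Dual" is taken in the usual sense of complementing the arguments and the result. All names are hypothetical, and the formulas tested are those written in the comments.

```python
from itertools import product

W = 3                        # number of atoms; an arbitrary choice for this sketch
U = (1 << W) - 1             # the element u (universe)
K = range(U + 1)


def comp(a):
    """Boolean complement a'."""
    return ~a & U


def minus(a, b):
    """Exception: a - b = ab'."""
    return a & comp(b)


def implies(a, b):
    """Implication: a > b = a' + b."""
    return comp(a) | b


def check_duality():
    """The dual of a > b, namely (a' > b')', equals b - a."""
    return all(
        comp(implies(comp(a), comp(b))) == minus(b, a)
        for a, b in product(K, repeat=2)
    )


def check_d2():
    """D2: a - (b - a) = a."""
    return all(minus(a, minus(b, a)) == a for a, b in product(K, repeat=2))


def check_d3():
    """D3: [a-(b-c)]-(d-d) = u-{[u-(a-b)]-[c-(u-a)]}."""
    for a, b, c, d in product(K, repeat=4):
        lhs = minus(minus(a, minus(b, c)), minus(d, d))
        inner = minus(minus(U, minus(a, b)), minus(c, minus(U, a)))
        if lhs != minus(U, inner):
            return False
    return True
```

Each check returns `True` in this model; both sides of $D_3$ reduce to $a(b'+c)$.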

To actually transform a set of "exception" postulates into a set of "implication" postulates, let me take a set due to Taylor.* This set consists of the postulates $E_1$-$E_5$ following.†

$E_1$. $K$ contains at least two distinct elements.

$E_2$. If $a$ and $b$ are elements of $K$, $a-b$ is an element of $K$.

$E_3$. $a-(b-b) = a$.

$E_4$. There exists a unique element $u$ in $K$ such that $a-(u-b) = b-(u-a)$.

DEFINITION 1e. $a_1 = u-a$.

$E_5$. $a-(b-c) = [(a-b)_1-(a-c_1)]_1$.

If we replace $a-b$ by $b\supset a$ and $u$ by $z$, and re-letter, $E_1$-$E_5$ become the following postulates $F_1$-$F_5$ in terms of implication.

$F_1$. $K$ contains at least two distinct elements.

$F_2$. If $a$ and $b$ are elements of $K$, $a\supset b$ is an element of $K$.

$F_3$. $(a\supset a)\supset b = b$.


* J. S. Taylor, loc. cit.
† In $E_3$, $E_4$, $E_5$ there is to be understood the supposition that the elements involved and their indicated combinations belong to $K$. In $E_5$ there is to be understood the further supposition that $E_4$ holds.




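$F_4$. There exists a unique element $z$ such that $(a\supset z)\supset b = (b\supset z)\supset a$.

DEFINITION 1f. $a_1 = a\supset z$.

$F_5$. $(a\supset b)\supset c = [(a_1\supset c)\supset(b\supset c)_1]_1$.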
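
The uniqueness clauses in the exception and implication postulates can be explored by brute force in a finite model. The sketch below is only an illustration under stated assumptions: the subsets of a three-element set coded as bitmasks, $a-b = ab'$, $a\supset b = a'+b$, and hypothetical function names. It searches for every element that can play the part of $u$ (respectively $z$), and then tests the remaining identities in the form written in the comments.

```python
from itertools import product

W = 3                        # number of atoms; an arbitrary choice for this sketch
TOP = (1 << W) - 1           # the universe
K = range(TOP + 1)


def minus(a, b):
    """Exception: a - b = ab'."""
    return a & ~b & TOP


def implies(a, b):
    """Implication: a > b = a' + b."""
    return (~a | b) & TOP


def candidates_for_u():
    """All u with a-(u-b) = b-(u-a) for every a, b in K."""
    return [
        u for u in K
        if all(minus(a, minus(u, b)) == minus(b, minus(u, a))
               for a, b in product(K, repeat=2))
    ]


def candidates_for_z():
    """All z with (a > z) > b = (b > z) > a for every a, b in K."""
    return [
        z for z in K
        if all(implies(implies(a, z), b) == implies(implies(b, z), a)
               for a, b in product(K, repeat=2))
    ]


def check_remaining(u, z):
    """a-(b-b) = a, (a > a) > b = b, and the two distributive-type
    identities, with a_1 = u - a on one side and a_1 = a > z on the other."""
    for a, b, c in product(K, repeat=3):
        e = lambda x: minus(u, x)        # a_1 in the exception system
        f = lambda x: implies(x, z)      # a_1 in the implication system
        ok = (
            minus(a, minus(b, b)) == a
            and implies(implies(a, a), b) == b
            and minus(a, minus(b, c)) == e(minus(e(minus(a, b)), minus(a, e(c))))
            and implies(implies(a, b), c)
            == f(implies(implies(f(a), c), f(implies(b, c))))
        )
        if not ok:
            return False
    return True
```

In this model the only admissible $u$ is the universe and the only admissible $z$ is the null element, which agrees with $z$ and $u$ being duals of each other.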

10. Postulates for the logic of propositions. I take up finally the last item of my paper: to show that a simple change in one of the postulates $A_1$-$A_4$ will make these postulates a set of independent postulates for the logic of propositions as a two-element Boolean algebra. The change consists in replacing Postulate $A_4$ by Postulate $A_4'$ following:

POSTULATE $A_4'$. $K$ consists of two distinct elements.

That $A_1$, $A_2$, $A_3$, $A_4'$ are necessary and sufficient for a two-element Boolean algebra, is obvious. That $A_1$, $A_2$, $A_3$, $A_4'$ are mutually independent, is seen from the table of §3: in that table each of the independence systems for $A_1$, $A_2$, $A_3$ consists of only two elements.*


* In the same way, and for the same reasons, one can transform into independent postulate- 
sets for the logic of propositions my two sets for Boolean algebra in terms of rejection (see my Papers 
I, II, loc. cit.). A like remark applies to Huntington’s first set of postulates for Boolean algebra (E. V. 
Huntington, (III) Sets of independent postulates for the algebra of logic, these Transactions, vol. 5 
(1904), pp. 288-309). 


UNIVERSITY OF CALIFORNIA, 
BERKELEY, CALIF. 


ON NORMAL KUMMER FIELDS OVER A 
NON-MODULAR FIELD* 


BY 
A. ADRIAN ALBERT 


1. Let $F$ be any non-modular field, $p$ an odd prime, $\zeta \neq 1$ a $p$th root of unity. Suppose that $\mu$ in $F(\zeta)$ is not the $p$th power of any quantity of $F(\zeta)$ so that the equation $y^p = \mu$ is irreducible in $F(\zeta)$. Then the field $F(y, \zeta)$ is called a Kummer† field over $F$.

In the present paper we shall give a formal construction of all normal Kummer fields over $F$. This is equivalent to a construction of all fields $F(x)$ of degree $p$ over $F$ such that $F(x, \zeta)$ is cyclic of degree $p$ over $F(\zeta)$. In particular we provide a construction of all cyclic fields of degree $p$ over $F$.

We shall also apply the cyclic case to prove that a normal division algebra $D$ of degree $p$ over $F$ is cyclic if and only if $D$ contains a quantity $y$ not in $F$ such that $y^p = \gamma$ in $F$.

2. The equation

$$g(\zeta) \equiv \zeta^{p-1} + \zeta^{p-2} + \dots + \zeta + 1 = 0$$

is irreducible in the field $R$ of all rational numbers and has all the primitive $p$th roots of unity as roots. If $F$ is any non-modular field, then $g(\zeta) = 0$ has an irreducible factor $h(\zeta) = 0$ in $F$ and with $\zeta$ as a root. The roots of $h(\zeta) = 0$ are all powers of $\zeta$ and hence are in a sub-field $L$ of $R(\zeta)$. But then the coefficients of $h(\zeta) = 0$ are in $L$ so that the group of $h(\zeta)$ with respect to $F$ is its group with respect to $L$. This latter group is the group of all the automorphisms of the cyclic field $R(\zeta)$ leaving the quantities of $L$ invariant and is a sub-group of the group of $R(\zeta)$. Every sub-group of a cyclic group is cyclic, so that $h(\zeta) = 0$ has a cyclic group generated by

$$T:\quad \zeta \to \zeta^t,$$

where $t$ is an integer belonging to the exponent $n$ modulo $p$, $n$ being the degree of $h(\zeta) = 0$, so that $t^n \equiv 1$ (mod $p$). We may write

$$\zeta_k = \zeta^{t^{k-1}} = \zeta^{t_k} \qquad (k = 1, \dots, n), \tag{1}$$

so that we have

$$t_k \equiv t^{k-1} \pmod{p}, \qquad 1 \le t_k < p. \tag{2}$$

Then $T$ is equivalent to the cyclic substitution $(\zeta_1, \zeta_2, \dots, \zeta_n)$ on the roots of $h(\zeta) = 0$.

* This paper is a revision and amplification of the paper On cyclic equations of prime degree, which I presented to the Society on December 27, 1933; it was received by the editors March 17, 1934.

† If $F$ is the field of all rational numbers, then $F(y, \zeta)$ is the ordinary Kummer field of modern arithmetic. Our work is a generalization to any non-modular field of that special case.
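
The arithmetic behind the substitution $T$ is elementary and can be traced numerically. The following sketch is an illustration, not part of the paper: it assumes the indexing $\zeta_k = \zeta^{t^{k-1}}$, represents each root only by its exponent modulo $p$, and uses hypothetical function names.

```python
from math import gcd


def multiplicative_order(t, p):
    """Smallest n >= 1 with t**n congruent to 1 modulo p."""
    if gcd(t, p) != 1:
        raise ValueError("t must be prime to p")
    n, power = 1, t % p
    while power != 1:
        power = power * t % p
        n += 1
    return n


def conjugate_exponents(t, p):
    """Exponents t_k = t**(k-1) mod p for k = 1, ..., n (assumed indexing)."""
    n = multiplicative_order(t, p)
    return [pow(t, k - 1, p) for k in range(1, n + 1)]


def apply_T(exponents, t, p):
    """T raises zeta to the power t, so every exponent is multiplied by t."""
    return [e * t % p for e in exponents]
```

For $p = 7$ and $t = 2$ the order is $n = 3$, the exponents are `[1, 2, 4]`, and `apply_T` turns them into `[2, 4, 1]`: the same list moved one place along, which is the cyclic substitution on $\zeta_1, \zeta_2, \zeta_3$ described in the text.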

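If $\lambda$ and $\mu$ are any two quantities of $K = F(\zeta)$ we say that $\lambda$ is $p$-equal to $\mu$ and write

$$\lambda \underset{(p)}{=} \mu \tag{3}$$

if $\lambda = \nu^p\mu$ for some quantity $\nu \neq 0$ of $K$.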


H. Hasse* has then given a purely algebraic proof of

LEMMA 1. If

$$y^p = \mu \underset{(p)}{\neq} 1,$$

then $Z = K(y)$ is cyclic of prime degree $p$ over $K$ and with generating automorphism

$$S:\quad y \to \zeta y.$$

Conversely every cyclic field $Z$ of degree $p$ over $K$ is equal to a field $K(y)$,

$$y^p = \mu \underset{(p)}{\neq} 1.$$


Moreover if also $Z = K(z)$, $z^p = \mu'$ in $K$, then

$$\mu' \underset{(p)}{=} \mu^s \qquad (0 < s < p),$$

so that $z = \lambda y^s$ where $\lambda$ is in $K$.


3. We now assume that $Z$ is any normal field of degree $pn$ over $F$ containing $K = F(\zeta)$ of degree $n$ over $F$. Then $K$ is the set of all quantities of $Z$ unaltered by a cyclic sub-group $H$ of order $p$ of the group of $Z$, and $Z$ is cyclic of degree $p$ over $K$. By Lemma 1, $Z = F(y, \zeta)$, $y^p = \mu$ in $K$ and $H = (I, S, \dots, S^{p-1})$ where $S$ is given above. We can then decompose the group $G$ of $Z$ relative to $H$ and write $G = H + H\sigma_1 + \dots + H\sigma_{n-1}$. Then $\sigma_1, \dots, \sigma_{n-1}$ carry $\zeta$ to the other roots of the irreducible equation $h(\zeta) = 0$. In particular one $\sigma_i = \tau$ carries $\zeta$ to $\zeta^t$.

We let $T = \tau^p$ so that $T$ also carries $\zeta$ to $\zeta^t$ since $t^p \equiv t$ (mod $p$). Then $\tau^n$ leaves $\zeta$ unaltered and is in $H$. Hence $\tau^n = S^k$, $T^n = S^{kp} = I$.

The group $G$ now has the decomposition $G = H + HT + \dots + HT^{n-1}$. For otherwise $T^r = S^iT^j$ where $n > r > j$ so that $T^{r-j} = S^i$ leaves $\zeta$ unaltered, which is impossible. We have proved that


* Bericht über Klassenkörper, Jahresbericht der Deutschen Mathematiker-Vereinigung, vol. 36 (1927), pp. 232-311, p. 262.


$$G = (S^iT^j) \qquad (i = 0, 1, \dots, p-1;\ j = 0, 1, \dots, n-1).$$

The group $G$ has a cyclic sub-group $(T^j)$ of order $n$ and hence $Z$ has a sub-field $F(x)$ of degree $p$ over $F$. Moreover

$$y^T = \lambda y^r \qquad (\lambda \text{ in } K).$$

For $y^T$ in $Z$ evidently generates $K(y)$ and we may apply Lemma 1. But

$$y^{TS} = (\lambda y^r)^S = \zeta^r\lambda y^r = \zeta^{et}\lambda y^r = y^{S^eT}, \tag{4}$$

where $et \equiv r$ (mod $p$) so that $e \equiv rt^{n-1}$ (mod $p$). Hence $TS = S^eT$. Conversely if $TS = S^eT$ then $r \equiv et$ (mod $p$) is determined and we have proved*

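THEOREM 1. Let $F(x)$ have degree $p$ over $F$ and $F(x, \zeta) = Z$ be normal over $F$. Then $Z$ has the group

$$G = (S^iT^j) \qquad (i = 0, 1, \dots, p-1;\ j = 0, 1, \dots, n-1), \tag{5}$$

such that $S^p = T^n = I$, the identity automorphism, and

$$TS = S^eT \qquad (0 < e < p). \tag{6}$$

Moreover $Z = F(y, \zeta)$ where $y^p = \mu$ in $F(\zeta)$,

$$y^S = \zeta y, \qquad y^T = \lambda y^r \qquad (\lambda \text{ in } F(\zeta)), \tag{7}$$

and $r \equiv et$ (mod $p$).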


Conversely every normal field $Z > F(\zeta)$ of degree $p$ over $K = F(\zeta)$ is generated as a field $Z = F(y, \zeta)$, $y^p = \mu$ in $K$, such that

$$\mu \underset{(p)}{\neq} 1, \qquad \mu(\zeta^t) \underset{(p)}{=} [\mu(\zeta)]^r \qquad (1 \le r < p). \tag{8}$$

The group of $Z$ is then given by (5), (6), (7) where $e$ is determined by $r \equiv et$ (mod $p$) and $Z$ contains a sub-field $F(x)$ of degree $p$ over $F$, the field of all quantities of $Z$ unaltered by the automorphism $T$.


It is evident that F(x) is uniquely determined in the sense of equivalence 
and is generated by any quantity 


p—1 
(9) x= = 


t=0 


for which at least one $a_i \neq 0$ for $i > 0$. Moreover the equation

$$\phi(\eta) \equiv (\eta - x)(\eta - x^S) \cdots (\eta - x^{S^{p-1}}) = 0 \tag{10}$$

has coefficients in $F$, is irreducible in $F$, and has $x$ as a root. Hence Theorem 1


* A similar result was obtained by Hilbert for the case F=R. 




gives a formal construction of all fields $F(x)$ of degree $p$ over $F$ with the property that $F(x, \zeta)$ is normal over $F$ in terms of the construction of all quantities $\mu$ satisfying (8).

If in particular $F(y, \zeta)$ has an abelian group, then $F(y, \zeta) = F(x) \times F(\zeta)$, where $F(x)$ is cyclic over $F$. Conversely if $F(x)$ is cyclic over $F$, then $F(x) \times F(\zeta) = F(y, \zeta)$ has an abelian group, $e = 1$, $r = t$ and we have


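THEOREM 2. Let $\mu$ range over all quantities of $F(\zeta)$ such that

$$\mu \underset{(p)}{\neq} 1, \qquad \mu(\zeta^t) \underset{(p)}{=} [\mu(\zeta)]^t. \tag{11}$$

Then $Z = F(\mu^{1/p}, \zeta) = F(x) \times F(\zeta)$ where $F(x)$ is cyclic of degree $p$ over $F$. Conversely every cyclic field $F(x)$ of degree $p$ over $F$ is the uniquely defined sub-field of such an $F(\mu^{1/p}, \zeta)$.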

4. We proceed now to the construction of the quantities $\mu$. The condition

$$\mu \underset{(p)}{\neq} 1$$

is evidently an irreducibility condition depending intrinsically on $F$ itself and so must remain in our final conditions. We first prove

LEMMA 2. The integer $r$ satisfies the congruence

$$r^n \equiv 1 \pmod{p}. \tag{12}$$


For 


if = then = 
(p) (p) 


and hence 
— 
(p) 


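But then if $y^p = \mu$ the quantity $y^{r^n-1} = \lambda y^s$ where $r^n - 1 \equiv s$ (mod $p$), $0 \le s < p$ and $\lambda$ is in $F(\zeta)$. But $y^s$ is then in $F(\zeta)$ so that $s = 0$.

We have observed that $0 < r < p$ so that there exists an integer $\rho$ such that

$$\rho r \equiv 1 \pmod{p}. \tag{13}$$

We define

$$\rho_k \equiv \rho^{k-1} \pmod{p}, \qquad 0 < \rho_k < p, \tag{14}$$

for all integer values of $k$, where $\rho_{n+1} = \rho_1 = 1$, and $\rho^{-a}$, $a > 0$, is to be defined as a corresponding positive power of $\rho$. Then

$$r\rho_k \equiv \rho_{k-1} \pmod{p}. \tag{15}$$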




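We may then prove

LEMMA 3. Let $\lambda$ be any quantity of $F(\zeta)$ and define

$$\mu = \prod_{k=1}^{n} [\lambda(\zeta_k)]^{\rho_k}. \tag{16}$$

Then

$$\mu(\zeta^t) \underset{(p)}{=} [\mu(\zeta)]^r. \tag{17}$$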


For the automorphism $T$ carrying $\zeta$ to $\zeta^t$ carries each $\zeta_k$ to $\zeta_{k+1}$. Hence


(18) = = 


k=l k=l 


while, by (15), 


= [JAG = 
(p) 


k=l 


as desired. 
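
The congruence (15) and its use in Lemma 3 involve only exponents modulo $p$, so they can be followed numerically. The sketch below is an illustration under stated assumptions: it takes $\rho_k \equiv \rho^{k-1}$ (mod $p$), requires $r^n \equiv 1$ (mod $p$) as in Lemma 2, and records a product of powers of the $\lambda(\zeta_k)$ only through its list of exponents, with the index $k$ taken modulo $n$. The function names are hypothetical, and `pow(r, -1, p)` needs Python 3.8 or later.

```python
def rho_sequence(r, p, n):
    """[rho_1, ..., rho_n] with rho*r = 1 (mod p), rho_k = rho**(k-1) mod p."""
    rho = pow(r, -1, p)          # modular inverse of r
    return [pow(rho, k - 1, p) for k in range(1, n + 1)]


def satisfies_15(r, p, n):
    """Check r*rho_k = rho_(k-1) (mod p) for k = 1..n, with rho_0 = rho_n."""
    if pow(r, n, p) != 1:
        raise ValueError("Lemma 2 requires r**n = 1 (mod p)")
    rhos = rho_sequence(r, p, n)
    # rhos[k] holds rho_(k+1); rhos[k - 1] wraps around to rho_n when k = 0.
    return all(r * rhos[k] % p == rhos[k - 1] for k in range(n))


def exponents_after_T(rhos):
    """T carries zeta_k to zeta_(k+1): the exponent on position k moves to k+1."""
    return [rhos[-1]] + rhos[:-1]


def exponents_of_rth_power(rhos, r, p):
    """Exponents of the r-th power, reduced modulo p (p-th powers dropped)."""
    return [r * e % p for e in rhos]
```

With $p = 7$, $n = 3$, $r = 4$ one gets $\rho = 2$ and `rho_sequence(4, 7, 3) == [1, 2, 4]`. Applying $T$ and raising to the power $r$ both give the exponent list `[4, 1, 2]`, which is the $p$-equality asserted in Lemma 3 read off at the level of exponents.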
Let now 


(p) (p) 


Then define 


(19) M = TIA) 


where A Then A(¢,) so that 


(20) = = p 
(p) (p) 


and hence 


(21) M = 
(p) 


But $n$ is not divisible by $p$ so that $z = y^n$ generates $K(y)$,


2? = M. 
(p) 


Hence $F(y, \zeta) = F(w, \zeta)$ where $w^p = M$ is a quantity of the form (16). Conversely if $\mu$ has the form (16) and


(?) 




then $F(y, \zeta)$, $y^p = \mu$, is normal of degree $np$ over $F$. We have proved


THEOREM 3. Let $\lambda$ range over all quantities of $F(\zeta)$ such that

$$y^p = \mu = \prod_{k=1}^{n} [\lambda(\zeta_k)]^{\rho_k} \underset{(p)}{\neq} 1. \tag{22}$$

Then $F(y, \zeta)$ is a normal field of Theorem 1. Conversely every normal field of Theorem 1 is generated by a $\mu$ defined by (22).


We have now succeeded in giving a formal construction of all the fields of Theorem 1. In particular we have constructed all cyclic fields of prime degree over $F$. For this case we have $\rho t \equiv 1$ (mod $p$), and may state


THEOREM 4. Let $\rho_k \equiv t^{1-k}$ (mod $p$) so that $t\rho_k \equiv \rho_{k-1}$ (mod $p$) and let $\lambda$ range over all quantities of $F(\zeta)$ such that

$$a = \prod_{k=1}^{n} [\lambda(\zeta_k)]^{\rho_k} \tag{23}$$

is not the $p$th power of any quantity $b$ of $F(\zeta)$. Then if

$$z^p = a, \tag{24}$$

the field $F(z, \zeta)$ is cyclic of degree $np$ over $F$ and

$$F(z, \zeta) = F(x) \times F(\zeta),$$

where $F(x)$ is cyclic of degree $p$ over $F$. Conversely every cyclic field $F(x)$ of degree $p$ over $F$ is generated as the uniquely defined sub-field of such an $F(z, \zeta)$.


We have thus given a construction of all cyclic fields of prime degree over any non-modular field $F$ where the condition $a \neq b^p$ is the irreducibility condition.

5. On normal division algebras of degree p. Let $Z$ be a cyclic field of degree $p$ over $F$ so that every automorphism of $Z$ is a power of an automorphism $S$ given by $z \to z^S$ for every $z$ and corresponding $z^S$ of $Z$. Define an algebra $D$ whose quantities have the form

$$\sum_{i=0}^{p-1} z_i y^i \qquad (z_i \text{ in } Z), \tag{25}$$

such that

$$y^i z = z^{S^i} y^i, \qquad y^p = \gamma. \tag{26}$$

Then $D$ is a cyclic algebra over $F$ and is a normal division algebra if and only




if $\gamma \neq N(z)$ for any $z$ in $Z$. Evidently $D$ is uniquely defined by $Z$, $S$, $\gamma$ and we write

$$D = (Z, S, \gamma) = (Z, S, \delta), \qquad \delta = N(c)\gamma \tag{27}$$

for any $c \neq 0$ of $Z$. For $\gamma$ is replaced by $\delta$ when we replace $y$ by $cy$. Also*

$$(Z, S, \gamma) \times (Z, S, \delta) \sim (Z, S, \gamma\delta). \tag{28}$$

If $D$ is a cyclic normal division algebra of degree $p$ over $F$, then $D$ has the above form and hence contains a sub-field $F(y)$, $y^p = \gamma$ in $F$.

Conversely, let $D$ be any normal division algebra of degree $p$ over $F$ with $F(x)$, $x^p = \beta$ in $F$, as sub-field. Let $K = F(\zeta)$ of degree $n$ over $F$. The algebra

$$M = (K, T, 1), \tag{29}$$

a cyclic algebra of degree $n$ over $F$, is a total matric algebra. We form the direct product $M \times D$ which evidently contains $K \times D = D_0$ as sub-algebra. Algebra $D_0$ is a normal division algebra of degree $p$ over $K$ and has the cyclic sub-field $Z = K(x)$. Moreover

$$D_0 = (Z, S, \gamma), \tag{30}$$

where $\gamma$ is in $K$ and the automorphism $S$ is given by the transformation

$$yx = x^S y = \zeta xy. \tag{31}$$


Let $M$ have a basis $\zeta^i j^k$ ($i, k = 0, 1, \dots, n-1$) such that $j^n = 1$. Then in $D \times M$ we have

$$j(yx)j^{-1} = y_T x = j(\zeta xy)j^{-1} = \zeta^t x y_T, \tag{32}$$

where $y_T = jyj^{-1}$ is in $D \times M$. But $y$ is commutative with $\zeta$ since $y$ is in $D_0$. Also $y\zeta = \zeta y$ implies that $y_T\zeta^t = \zeta^t y_T$ and hence $y_T$ is also commutative with $\zeta$. For $F(\zeta^t) = F(\zeta)$. The algebra of all quantities of $D \times M$ commutative with $\zeta$ is evidently $D_0$ so that $y_T$ is in $D_0$.

Since $y_T x = \zeta^t x y_T$ while $y^t x = \zeta^t x y^t$, we then have $y_T = dy^t$ where $d$ is in $Z$. Then


(33) (yr)? = = = 
where N(d) is the norm of the quantity d of the cyclic field Z. But 
(34) Di ~ @,S, = Z, S, 
by (33), (27). 
* If $A$ is any normal simple algebra, then $A = M \times D$, where the total matric algebra $M$ and the normal division algebra $D$ are uniquely determined in the sense of equivalence. If $A$ and $B$ are two normal simple algebras with the same $D$, we say that $A$ and $B$ are similar, and write $A \sim B$.




By applying (34) we have Do"~(Z, S, y(¢")), and hence 
~ (Z, S, v(b«)), 
from which, if u=)-pit, =n +Ap by (25), 
Do ~ Do ~ (Z, S, «), 


where 


If $D$ is any normal simple algebra of prime degree $p$ over $F$, and $K$ is a field of degree $n$ not divisible by $p$, then $D$ is a total matric algebra if and only if $D \times K$ over $K$ is a total matric algebra. Moreover, if $r$ is prime to $p$, then $D^r$ is total matric if and only if $D$ is total matric. Hence, if $D_0 = D \times K$ and $D_0^r$ is a total matric algebra, then so is $D$.

Algebra Do” is a normal division algebra since D is a normal division 
algebra. Hence a# N(c) for any c of Z. In particular ab? for any b of K. 
Thus Do contains a cyclic field* W of prime degree p over F. But then 
Do" XW’ over W’ =Wg, the composite of W and K, is a total matric alge- 
bra. Hence D)XW’ is a total matric algebra and so must be DXW over 
W, W=W. But then D has a sub-field equivalent to W and is cyclic. 


THEOREM 5. A normal division algebra $D$ of prime degree $p$ over $F$ is cyclic if and only if $D$ has a sub-field $F(x)$, $x^p = \gamma$ in $F$.


* The cyclic sub-field of $F(a^{1/p})$ defined by Theorem 4.


THE INSTITUTE FOR ADVANCED STUDY,
PRINCETON, N. J. 




CORRECTION TO A PAPER ON THE WHITEHEAD- 
HUNTINGTON POSTULATES 


BY 
A. H. DIAMOND 


Professor E. W. Chittenden has called to my attention an error which 
occurs in the last footnote on page 940 of my paper in volume 35 of these 
Transactions. It should read as follows: 

The largest number hitherto published is $2^8 = 256$. See Dorothy McCoy, The complete existential theory of eight fundamental properties of topological spaces, Tôhoku Mathematical Journal, vol. 33 (1931), pp. 88-116. $2^6 = 64$ propositions occur in a paper of B. A. Bernstein, The complete existential theory of Hurwitz's postulates for abelian groups and fields, etc.



